From: Dexuan Cui
To: Stefan Hajnoczi
CC: "Jorgen S. Hansen", "davem@davemloft.net", "netdev@vger.kernel.org",
    "gregkh@linuxfoundation.org", "devel@linuxdriverproject.org",
    KY Srinivasan, Haiyang Zhang, "Stephen Hemminger", George Zhang,
    Michal Kubecek, Asias He, "Vitaly Kuznetsov", Cathy Avery,
    "jasowang@redhat.com", Rolf Neugebauer, Dave Scott, "Marcelo Cerri",
    "apw@canonical.com", "olaf@aepfle.de", "joe@perches.com",
    "linux-kernel@vger.kernel.org", Dan Carpenter
Subject: RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
Date: Fri, 18 Aug 2017 23:07:37 +0000
References: <20170817135559.GG5539@stefanha-x1.localdomain>
    <04460E3B-B213-4090-96CD-00CEEBE6AC32@vmware.com>
    <20170818153716.GB17572@stefanha-x1.localdomain>
In-Reply-To: <20170818153716.GB17572@stefanha-x1.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > CID is not really used by us, because we only support guest<->host
> > communication, and don't support guest<->guest communication. The
> > Hyper-V host references every VM by VmID (which is invisible to the
> > VM), and a VM can only talk to the host via this feature.
>
> Applications running inside the guest should use VMADDR_CID_HOST (2) to
> connect to the host, even on Hyper-V.

I have no objection, and this patch does support this usage by user-space
applications.

> By the way, we should collaborate on a test suite and a vsock(7) man
> page that documents the semantics of AF_VSOCK sockets. This way our
> transports will have the same behavior and AF_VSOCK applications will
> work on all 3 hypervisors.

I can't agree more. :-)
BTW, I have been using Rolf's test suite to test my patch:
https://github.com/rn/virtsock/tree/master/c
Maybe this can be a good starting point.

> Not all features need to be supported. For example, VMCI supports
> SOCK_DGRAM while Hyper-V and virtio do not. But features that are
> available should behave identically.

I totally agree, though I'm afraid Hyper-V may have a few more limitations
compared to VMware/KVM due to the <--> mapping.

> > Can we use the 'protocol' parameter in the socket() function:
> > int socket(int domain, int type, int protocol)
> >
> > IMO currently the 'protocol' is not really used.
> > I think we can modify __vsock_core_init() to allow multiple transport
> > layers to be registered, and we can define different 'protocol' numbers
> > for VMware/KVM/Hyper-V, and ask the application to explicitly specify
> > what should be used.
> > Considering compatibility, we can use the default transport in a given
> > VM depending on the underlying hypervisor.
>
> I think AF_VSOCK should hide the transport from users/applications.

Ideally yes, but let's consider the KVM-on-KVM nested scenario: when an
application in the Level-1 VM creates an AF_VSOCK socket and calls connect()
on it, how can we know whether the app is trying to connect to the Level-0
host or to the Level-2 VM? We can't.

That's why I propose we should use the 'protocol' parameter to distinguish
between "to guest" and "to host".

With my proposal, in the above scenario, by default (the 'protocol' is 0),
we choose the "to host" transport layer when socket() is called; if the
userspace app explicitly specifies "to guest", we choose the "to guest"
transport layer when socket() is called. This way, connect(), bind(), etc.
can work automatically. (Of course, the default transport for a given VM can
be chosen better if we detect which nesting level the app is running on.)

> Think of same-on-same nested virtualization: VMware-on-VMware or
> KVM-on-KVM. In that case specifying VMCI or virtio doesn't help.
>
> We'd still need to distinguish between "to guest" and "to host"
> (currently VMCI has code to do this but virtio does not).
>
> The natural place to distinguish the destination is when dealing with
> the sockaddr in connect(), bind(), etc.
>
> Stefan

Thanks,
-- Dexuan
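
A rough user-space sketch of the 'protocol' idea discussed above, for
illustration only. The two VSOCK_PROTO_* values are hypothetical: no kernel
header defines them, and a kernel without such a change would be expected to
reject a non-zero 'protocol'. The helper names vsock_host_addr() and
vsock_connect_host() are also made up for this example. Only AF_VSOCK,
struct sockaddr_vm and VMADDR_CID_HOST come from the existing headers.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

/* Hypothetical 'protocol' numbers for the proposal; NOT a real kernel ABI. */
enum {
	VSOCK_PROTO_TO_HOST  = 0,	/* default: pick the "to host" transport */
	VSOCK_PROTO_TO_GUEST = 1,	/* assumed value for the "to guest" transport */
};

/* Build the address of the host (VMADDR_CID_HOST) for the given port. */
struct sockaddr_vm vsock_host_addr(unsigned int port)
{
	struct sockaddr_vm addr;

	memset(&addr, 0, sizeof(addr));
	addr.svm_family = AF_VSOCK;
	addr.svm_cid = VMADDR_CID_HOST;
	addr.svm_port = port;
	return addr;
}

/*
 * Create an AF_VSOCK stream socket with the given 'protocol' and connect
 * it to the host. Returns the connected fd, or -1 with errno set.
 */
int vsock_connect_host(int protocol, unsigned int port)
{
	struct sockaddr_vm addr = vsock_host_addr(port);
	int fd = socket(AF_VSOCK, SOCK_STREAM, protocol);

	if (fd < 0)
		return -1;

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		int saved = errno;	/* close() may clobber errno */

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

An existing application would keep calling the equivalent of
vsock_connect_host(0, port) and get the "to host" transport, so only code
that wants to reach a nested guest would have to pass the non-default value.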