Subject: Re: [RFC 00/19] KVM: s390/crypto/vfio: guest dedicated crypto adapters
From: Tony Krowiak
To: Cornelia Huck
Cc: Pierre Morel, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
 kvm@vger.kernel.org, freude@de.ibm.com, schwidefsky@de.ibm.com,
 heiko.carstens@de.ibm.com, borntraeger@de.ibm.com, kwankhede@nvidia.com,
 bjsdjshi@linux.vnet.ibm.com, pbonzini@redhat.com, alex.williamson@redhat.com,
 alifm@linux.vnet.ibm.com, mjrosato@linux.vnet.ibm.com, qemu-s390x@nongnu.org,
 jjherne@linux.vnet.ibm.com, thuth@redhat.com, pasic@linux.vnet.ibm.com
Date: Tue, 21 Nov 2017 11:08:01 -0500
In-Reply-To: <20171120181345.3fbda311.cohuck@redhat.com>
References: <1507916344-3896-1-git-send-email-akrowiak@linux.vnet.ibm.com>
 <5baf5f90-6cac-3c09-7b66-1bc8b30b8093@linux.vnet.ibm.com>
 <20171114145722.4ab850a5.cohuck@redhat.com>
 <8a492b07-3d3b-f4cf-e139-7de345ea8188@linux.vnet.ibm.com>
 <20171116180308.289e5eed.cohuck@redhat.com>
 <1476b0a4-a2a3-2c48-107a-ab7b39b0e93e@linux.vnet.ibm.com>
 <0b40cee7-78d3-f96b-ab46-1f40b31251cb@linux.vnet.ibm.com>
 <20171117110742.1d416435.cohuck@redhat.com>
 <20171120181345.3fbda311.cohuck@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/20/2017 12:13 PM, Cornelia Huck wrote:
> On Fri, 17 Nov 2017 15:28:16 -0500
> Tony Krowiak wrote:
>
>> On 11/17/2017 05:07 AM, Cornelia Huck wrote:
>>> On Fri, 17 Nov 2017 08:07:15 +0100
>>> Pierre Morel wrote:
>>>
>>>> On 17/11/2017 00:35, Tony Krowiak wrote:
>>>>> On 11/16/2017 03:25 PM, Pierre Morel wrote:
>>>>>> On 16/11/2017 18:03, Cornelia Huck wrote:
>>>>>>> On Thu, 16 Nov 2017 17:06:58 +0100
>>>>>>> Pierre Morel wrote:
>>>>>>>> So I totally agree with Conny on that we should stabilize the
>>>>>>>> bus/device/driver modeling.
>>>>>>>>
>>>>>>>> I think it would be here a good place to start the discussion on things
>>>>>>>> like we started to discuss, Harald and I, off-line:
>>>>>>>> - why a matrix bus, in which case we can avoid it
>>>>>>> I thought it had been agreed that we should be able to ditch it?
>>>>>> I have not seen any comment on the matrix bus.
>>>>> As stated in a previous email responding to Connie, I decided to scrap the
>>>>> AP matrix bus. There will only ever be one matrix device that serves two
>>>>> purposes: To hold the APQNs of the queue devices bound to the VFIO AP
>>>>> matrix device driver; to serve as a parent of the mediated devices created
>>>>> for guests requiring access to the APQNs reserved for their use. So, instead
>>>>> of an AP matrix bus creating the matrix device, it will be created by the
>>>>> VFIO AP matrix driver in /sys/devices/ap_matrix/ during driver
>>>>> initialization.
>>>> Sorry, I did not see the mail, this of course changes a lot of things...
>>> One thing that would be useful for the next iteration is some ascii-art
>>> representation that shows how the different parts (matrix, ap driver,
>>> mdev, ...) tie together. That also would be useful to have in the
>>> documentation.
>> I plan on including some drawings with the documentation and will include
>> them in the cover letter as well.
> Sounds good.
>
>>>>>>>> - how to handle the repartition of queues on boot, reset and hotplug
>>>>> What do you mean by repartition of queues on boot?
>>>>>>> That's something I'd like to see a writeup for.
>>>>>> yes, and it may have an influence on the bus/device/driver/mdev design
>>>>> I don't understand the need to avoid implementation details. If you recall,
>>>>> the original design was modeled on AP queue devices. It was only after
>>>>> implementing that design that the shortcomings were revealed which is
>>>>> why we decided to base the model on the AP matrix. Keep in mind, this is
>>>>> an RFC, not a final patch set. I would expect some change from the
>>>>> implementation herein. In fact, I've already made many changes based on
>>>>> Connie's and Christian's review comments, none of which resulted in an
>>>>> overhaul of the design.
>>> I expect that any of the above can be accommodated by the design.
>>> A short writeup of what we may want to do for that would certainly help
>>> to validate that, though.
>> I have spent some time thinking about hotplug implementation and I
>> believe it can be accommodated within this design. I haven't looked
>> at the implications for reset yet and I don't really know what is
>> meant by "repartition of queues". I will include a write-up in the
>> next submission.
> FWIW, "repartition of queues" is also unclear to me.
>
>>>>>>>> - interruptions
>>>>>>> My understanding is that interrupts are optional so they can be left
>>>>>>> out in the first shot. With the gisa (that has not yet been posted), it
>>>>>>> should not be too difficult, no?
>>>>>> you are right I forgot that it is optional
>>>>> If the facilities bit (STFLE.65) indicating interrupts are available is not
>>>>> set for the guest, then the AP bus running on the guest will poll and
>>>>> interrupts will not have to be handled. This patch set does not enable
>>>>> interrupts, so it is not relevant at this time. We will not be able to
>>>>> handle interrupts for the guest until the GISA for passthrough patches
>>>>> are available. This will be addressed at that time.
>>> If you think it can be easily added later on, that would be fine for
>>> me. (I cannot comment on gisa details until it has been posted,
>>> obviously.)
>> Enabling AP interrupts is accomplished using the PQAP(AQIC) instruction
>> which is a mandatory interception. The instruction will be forwarded to
>> the VFIO AP device driver via an ioctl call on the mediated matrix
>> device file descriptor. There will be some GISA set up needed and code
>> to feed the interrupt back to user space, but I believe that will be
>> provided by the forthcoming GISA passthrough patches. The bottom line is,
>> I don't anticipate any major design change to handle interrupt processing.
> Cool, that's what I wanted to hear.
>
>>>>>>>> - virtualization of the AP
>>>>>>> Is this really needed?
>>>>>>> It would complicate everything a lot.
>>>>>> Concern has no sense without interception.
>>>>> Virtualization of AP is not on the table right now.
>>>> If we implement interception, we must speak about this, even to say how
>>>> we do not implement virtualization.
>>> A note that we do not plan to virtualize it right now would be
>>> sensible, yes.
>> Will do.
>>> From what I remember, this would mean opening a huge can of worms for
>>> something that might be only of limited use. I'd prefer a simplistic
>>> but usable approach first. If virtualization should really become a
>>> requirement in the future, it might be better served by a different
>>> mechanism anyway.
>> I have done a little proof of concept code to get an idea if the AP matrix
>> design will be extensible to handle virtualization. I modeled the
>> proof of concept on the AP matrix model by creating a second mediated
>> matrix device type for virtualization. Of course, virtual and passthrough
>> matrix device types would have to be mutually exclusive; the admin would
>> have to choose one or the other. The sysfs model looked like this:
>>
>> /sys/devices/ap_matrix
>> ... [matrix]
>> ...... [mdev_supported_types]
>> ......... [vfio_ap_matrix-virtual]
>> ............ create
>> ............... [devices]
>> .................. [$uuid]
>> ..................... assign_adapter
>> ..................... assign_domain
>>
>> Using the assign_adapter file, one can assign a virtual adapter
>> ID to one or more real adapter IDs. For example, to assign virtual adapter
>> 4 to real adapters 3, 22 and 254:
>>
>> echo 4:3,22,254 > assign_adapter
>>
>> Using the assign_domain file, one can assign a virtual domain
>> ID to one or more real domain IDs. For example, to assign virtual domain
>> 0 to real domains 8 and 71:
>>
>> echo 0:8,0x47 > assign_domain
>>
>> All AP instructions would be intercepted for a virtual matrix.
>> The intercepted instructions would be forwarded to the VFIO AP matrix
>> device driver by QEMU using an ioctl implemented by the VFIO AP matrix
>> driver. If the mediated matrix device is vfio_ap_matrix-passthrough type,
>> things would work as they do now. If the type is vfio_ap_matrix-virtual,
>> the driver would:
>>
>> 1. Calculate all of the real APQNs that can be used by:
>>    * Retrieving the adapter IDs mapped to the APID specified in the APQN
>>      contained in the AP instruction
>>    * Retrieving the domain IDs mapped to the APQI specified in the APQN
>>      contained in the AP instruction
>>    * Combining all of the permutations of APID/APQI
>> 2. Determine which APQN would be best to use.
>> 3. Execute the instruction
>> 4. Return the result to the caller
>>
>> In other words, I think the current design is extensible; but even if not,
>> I see no reason we can't design a completely different mechanism for
>> virtualization.
> So it's basically a one-time effort at (re)configuration, and the
> virtualization facility will basically take care of the rest?

I'm not quite sure what you are asking, but I'll answer what I think you
mean. A new type of mediated matrix device will be introduced to configure
a virtual matrix for a guest. It provides the interfaces to map a virtual
adapter/domain ID to one or more real adapter/domain IDs. If by
virtualization facility you mean the VFIO AP matrix driver, then yes, the
driver will handle ioctl requests based on the type of the mediated matrix
device through which the request was submitted:

If the request is to configure the KVM guest's matrix:

* If the mediated matrix device type is passthrough:
  * Do validation of matrix
  * Configure the APM, AQM and ADM in the KVM guest's CRYCB according to
    the configuration specified via the mediated device's sysfs attribute
    files.
* If the mediated matrix device type is virtual:
  * Do validation of matrix
  * No need to configure CRYCB since all instructions will be intercepted

If the request is to execute an intercepted AP instruction:

* If the mediated matrix device type is passthrough:
  * Forward the instruction to the AP device and return the result to QEMU.
* If the mediated matrix device type is virtual:
  * Retrieve all of the real APQNs mapped to the virtual adapter and domain
    IDs configured in the mediated matrix device's sysfs attribute files
  * If there is more than one APQN mapping, then determine which would be
    best to use - algorithm TBD
  * Forward the instruction to the AP device and return the result.

Of course, these are just preliminary ideas at this time. I've only
prototyped the sysfs configuration interfaces. No back end prototyping has
been undertaken yet. If the ideas do not pan out, however, I think
virtualization can be introduced as an independent design.
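
For illustration only, the lookup described above for the virtual type
(map a virtual adapter/domain ID to its real IDs, then combine them into
candidate APQNs) could be sketched in user-space Python as follows. This is
not driver code. The names parse_assignment, candidate_apqns and pick_apqn
are hypothetical, the "virtual:real[,real...]" syntax is taken from the
echo examples quoted earlier in this mail, and taking the first candidate
merely stands in for the selection algorithm, which is still TBD.

```python
from itertools import product


def parse_assignment(text):
    """Parse 'virtual:real[,real...]' as written to assign_adapter/assign_domain."""
    virtual, _, reals = text.strip().partition(":")
    if not reals:
        raise ValueError(f"expected 'virtual:real[,real...]', got {text!r}")
    # int(x, 0) accepts both decimal ("71") and hex ("0x47") spellings.
    return int(virtual, 0), [int(r, 0) for r in reals.split(",")]


def candidate_apqns(adapter_map, domain_map, virtual_apid, virtual_apqi):
    """Return every real (APID, APQI) pair behind one virtual APQN."""
    real_adapters = adapter_map.get(virtual_apid, [])
    real_domains = domain_map.get(virtual_apqi, [])
    return list(product(real_adapters, real_domains))


def pick_apqn(candidates):
    """Placeholder policy (assumption): the real selection algorithm is TBD."""
    if not candidates:
        raise LookupError("no real APQN is mapped to this virtual APQN")
    return candidates[0]


# Assumed in-memory form of the sysfs configuration from the examples.
adapter_map = dict([parse_assignment("4:3,22,254")])
domain_map = dict([parse_assignment("0:8,0x47")])

# An intercepted instruction addressing virtual adapter 4, virtual domain 0.
candidates = candidate_apqns(adapter_map, domain_map, 4, 0)
print(candidates)
print(pick_apqn(candidates))
```

With the two example assignments, three real adapters and two real domains
yield six candidate pairs; the driver would forward the instruction to
whichever one the eventual selection policy picks.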