Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp4682824pxj; Tue, 25 May 2021 13:53:28 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxrzKrnEmT1i+tY9c2cCbjjDJYkd0OVGUGJI5ONVdoPPecnpBjCb58LX0d845L3OuA82pOL X-Received: by 2002:aa7:dc12:: with SMTP id b18mr33484830edu.52.1621976007894; Tue, 25 May 2021 13:53:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1621976007; cv=none; d=google.com; s=arc-20160816; b=Bs8ykmiNE9eRpuFjgsBpf0ThIIgslnuXiHB+lWdUi9zJI2YVrbb70h9tS4UdKqyRZp vJf5PMT/yHQi6Hrza6zJqpkXHnlrDrg/8lETwt9rtxueRKOMJZ/5G/wbqoWFL1Uxj2Kx PaBcuR6nQ6rVqc9vJmxCq0JhHOyuodFoncG7up7gDIdCH0gSrVztKx5LR4akGkBBgVTd O2BEEFpvz0cNbJQX0Z2UnQ8koGa8GPnsNwzj5Bgms6I5Kw2FBu4uV/zqP4/yWR6+R+ZR M+flXxR+hQP+lR918VaIlSlOCnTPzn9UDDc8Di9L4cDuFXYSyy4zeOfkDmPc7vKyXxd+ Nckw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=sWpa9lvpwKcGlvOMsTYGUtppmcEOL9ULWBsgRjkosaY=; b=nBQfgXIbVmInNWQhKDVK8/AwoRXbPcDXnA4BkYuM25qYaOid8N1owBRtR54RUzp0Od Q0w1Nn7IO4ZVegDo6m8WfC/rTbxn8vzmu4UEfajl40clOjvy+bPkFqEVvMEaTCltcOQT ixGWrHTOQT67BTtb7j/ng0QpJpINd+38dEq3mTKMyrkc0AgBsV+o6WVjZBH9LIbZvc8M Bmbu4dHI5KZ0xg7ihKXH1KFUjwOgvivIsahSpl9L7UWyfQhkcGOhXCkV7BR7hb/VjWIQ 9bGZpnt5lFLBeRqtrrceFJzbhy0Q62WuX45E1GXd3HSxwEOWko8iM6drqHXNZlqUqO62 8Vng== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=k9vpN0js; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id j20si16451130eje.346.2021.05.25.13.53.04; Tue, 25 May 2021 13:53:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=k9vpN0js; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232463AbhEYP62 (ORCPT + 99 others); Tue, 25 May 2021 11:58:28 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:14450 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S231918AbhEYP61 (ORCPT ); Tue, 25 May 2021 11:58:27 -0400 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 14PFXewf161867; Tue, 25 May 2021 11:56:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=subject : to : cc : references : from : message-id : date : mime-version : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=sWpa9lvpwKcGlvOMsTYGUtppmcEOL9ULWBsgRjkosaY=; b=k9vpN0jsT4DZTvOB5IXRN/v4xK4CRdIkHmFhNs/bQ7e7TzDIcruLb5+rN1pHbaGg82u2 T3Dc0U4rNK5ajbmGkIAt6Kx01nDG1Jk0UojPzj02/RLkiSE1zNawNznvJULgOIX5eKjJ 6II4F0I3VjM4OynH/UXEG/XAZ03F77+yxaAGnw5vWu0ZuL+jYGkfqcJPS332gGO6GsMs 5xtxhwSha5hDJ+A9OAFmFJAPnyk/hbiTTXqw5IvBGHxJpE+wX30c1TU9kq1FStpBbWB7 LlBWKOMByYmSlgzjjr1+4r8tzyF9nrri0xb3cWEZ8sUsbUh8yCxkCMDVf4YEWPVaPdnh gg== Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 38s2xvbw3n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 25 May 2021 11:56:54 -0400 Received: from m0098414.ppops.net (m0098414.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 14PFXh8p162351; Tue, 25 May 2021 11:56:54 -0400 Received: from ppma02dal.us.ibm.com (a.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.10]) by mx0b-001b2d01.pphosted.com with ESMTP id 38s2xvbw2w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 25 May 2021 11:56:54 -0400 Received: from pps.filterd (ppma02dal.us.ibm.com [127.0.0.1]) by ppma02dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 14PFqKGq003471; Tue, 25 May 2021 15:56:53 GMT Received: from b03cxnp07027.gho.boulder.ibm.com (b03cxnp07027.gho.boulder.ibm.com [9.17.130.14]) by ppma02dal.us.ibm.com with ESMTP id 38s1hbhurg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 25 May 2021 15:56:53 +0000 Received: from b03ledav002.gho.boulder.ibm.com (b03ledav002.gho.boulder.ibm.com [9.17.130.233]) by b03cxnp07027.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 14PFuqfS23265758 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 25 May 2021 15:56:52 GMT Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E90D4136065; Tue, 25 May 2021 15:56:51 +0000 (GMT) Received: from b03ledav002.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BB891136061; Tue, 25 May 2021 15:56:50 +0000 (GMT) Received: from cpe-172-100-179-72.stny.res.rr.com (unknown [9.85.177.219]) by b03ledav002.gho.boulder.ibm.com (Postfix) with ESMTP; Tue, 25 May 2021 15:56:50 +0000 (GMT) Subject: Re: [PATCH v4 2/2] s390/vfio-ap: control access to PQAP(AQIC) interception handler To: Jason Gunthorpe Cc: jjherne@linux.ibm.com, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, borntraeger@de.ibm.com, cohuck@redhat.com, pasic@linux.vnet.ibm.com, alex.williamson@redhat.com, kwankhede@nvidia.com, frankja@linux.ibm.com, david@redhat.com, imbrenda@linux.ibm.com, hca@linux.ibm.com References: <20210521193648.940864-1-akrowiak@linux.ibm.com> <20210521193648.940864-3-akrowiak@linux.ibm.com> <5d15fdf2-aee8-4e6c-c3e1-f07c76ce5974@linux.ibm.com> <20210525131912.GW1002214@nvidia.com> From: Tony Krowiak Message-ID: Date: Tue, 25 May 2021 11:56:50 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1 MIME-Version: 1.0 In-Reply-To: <20210525131912.GW1002214@nvidia.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: iIWHI-nYv2T0RPAeA9u62p6lt1ykuN8u X-Proofpoint-GUID: TTqWzYM-uuNS8hmGKB0h1U9sxqDlH2fn X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391,18.0.761 definitions=2021-05-25_07:2021-05-25,2021-05-25 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 bulkscore=0 suspectscore=0 adultscore=0 mlxlogscore=999 clxscore=1015 spamscore=0 mlxscore=0 priorityscore=1501 phishscore=0 malwarescore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2105250094 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > Why can't you put the locks in the right order? It looked trivial, I'm confused. > > Jason I explained this in one of my responses to the previous series. Maybe I didn't do a good job of it, so let me see if I can provide a more thorough explanation. The handle_pqap() function in priv.c does not have any access to the matrix_dev->lock which lives in the vfio_ap module, so there is no way for this function to control the order of locking. The only lock it can access is the lock provided for the hook function pointer which must be held for the duration of the execution of the hook. When the hook function (handle_pqap() in vfio_ap_ops.c) is called, it has to lock the matrix_dev->lock mutex to do its thing. The interception of the PQAP instruction that sets off the above scenario can happen simultaneously with both the vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm() instructions in vfio_ap_ops.c. The vfio_ap_mdev_set_kvm() function is called only by the group notifier callback when the vfio_ap driver is notified that the KVM pointer has been set. In this case, we could set the lock for the hook function before setting the matrix_dev->lock and calling the vfio_ap_mdev_set_kvm() function and all would be well. The vfio_ap_mdev_unset_kvm() function, however, is called both by the group notifier when the KVM pointer has been cleared or when the mdev is being removed. In both cases, the only way to get the KVM pointer - which is needed to unplug the AP resources from the guest - is from the matrix_mdev which contains it. This, of course, needs to be done while holding the matrix_dev->lock mutex. The vfio_ap_mdev_unset_kvm() function also clears the hook function pointer, but can only get the lock used to control access to it from the matrix_mdev; therein lies the rub. So we can have the following scenario which is flagged by lockdep: CPU x:                                            CPU y: --------                                             --------                                                       lock the matrix_dev->lock: vfio_ap_mdev_set_kvm in vfio_ap_ops.c lock the hook pointer: handle_pqap in priv.c lock the matrix_dev->lock handle_pqap in vfio_ap_ops.c                                                       lock the hook pointer: vfio_ap_mdev_set_kvm in vfio_ap_ops.c Maybe I'm missing something, but I was unable to find a way around this when the hook function pointer and its locking mechanism is stored in a field of a satellite structure of struct kvm.