From: Gerald Schaefer
To: Alex Williamson
Cc: linux-kernel@vger.kernel.org, Martin Schwidefsky, Sebastian Ott,
    Gerald Schaefer
Subject: [RFC PATCH 0/1] vfio-pci/iommu: Detach iommu group on remove path
Date: Tue, 21 Jul 2015 19:44:05 +0200
Message-Id: <1437500646-18031-1-git-send-email-gerald.schaefer@de.ibm.com>

Hi,

during IOMMU API function testing on s390 I hit the following scenario:

After binding a device to vfio-pci, the user completes the VFIO_SET_IOMMU
ioctl and stops, see the sample C program below. Now the device is
manually removed via "echo 1 > /sys/bus/pci/devices/.../remove", which
completes instantly because the device is not considered in use in
vfio_del_group_dev() and ops->request will be skipped (probably because
there was no VFIO_GROUP_GET_DEVICE_FD ioctl so far, only the SET_IOMMU,
which only triggered an "attach iommu group").

Although the SET_IOMMU ioctl triggered the attach_dev callback in the
underlying IOMMU API, removing the device in this way won't trigger the
detach_dev callback, neither during remove nor when the user program
continues with closing group/container.

On s390 this eventually leads to a kernel panic when binding the device
again to its non-vfio PCI driver, because of the missing arch-specific
cleanup in detach_dev. On x86 I couldn't trigger the panic, but I could
verify that detach_dev also won't get called in this scenario, which
probably means at least some kind of memory leak there.

I think I found a way to fix this in vfio code by calling
vfio_group_try_dissolve_container() from within vfio_del_group_dev(),
but I'm not really familiar with this code, so there may be better
ways to fix it. Any thoughts?

Regards,
Gerald


Here is the sample C program to trigger the ioctl:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

int main(void)
{
	int container, group, rc;

	container = open("/dev/vfio/vfio", O_RDWR);
	if (container < 0) {
		perror("open /dev/vfio/vfio");
		return -1;
	}

	/* IOMMU group 0 is the group of the device bound to vfio-pci */
	group = open("/dev/vfio/0", O_RDWR);
	if (group < 0) {
		perror("open /dev/vfio/0");
		return -1;
	}

	rc = ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
	if (rc) {
		perror("ioctl VFIO_GROUP_SET_CONTAINER");
		return -1;
	}

	/* This triggers the "attach iommu group", i.e. attach_dev */
	rc = ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);
	if (rc) {
		perror("ioctl VFIO_SET_IOMMU");
		return -1;
	}

	/* Wait here while the device is removed via sysfs */
	printf("Try device remove...\n");
	getchar();

	close(group);
	close(container);
	return 0;
}

Gerald Schaefer (1):
  vfio-pci/iommu: Detach iommu group on remove path

 drivers/vfio/vfio.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
2.3.8
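
For illustration only, the attach/detach imbalance described above can be
sketched as a toy user-space model. This is not kernel code and not the
proposed patch: all names (model_group, model_set_iommu, model_run, ...)
are hypothetical, and the rule that closing the group/container no longer
reaches an already removed device is an assumption taken from the
behaviour reported above. The with_fix flag stands in for the idea of
detaching on the remove path.

```c
/* Toy user-space model of the attach/detach imbalance; NOT kernel code.
 * All names here are hypothetical and exist only for illustration. */
#include <stdbool.h>

struct model_group {
	bool device_present;	/* device still bound to the vfio-pci stand-in */
	bool attached;		/* attach ran without a matching detach */
	int attach_calls;
	int detach_calls;
};

/* Stand-in for the arch-specific detach_dev callback. */
static void model_detach(struct model_group *g)
{
	if (g->attached) {
		g->attached = false;
		g->detach_calls++;
	}
}

/* Stand-in for VFIO_SET_IOMMU: attaches the device in the group. */
static void model_set_iommu(struct model_group *g)
{
	if (g->device_present && !g->attached) {
		g->attached = true;
		g->attach_calls++;
	}
}

/* Stand-in for "echo 1 > .../remove". No device fd was opened, so the
 * device counts as unused and removal completes at once. with_fix
 * models detaching on the remove path. */
static void model_remove_device(struct model_group *g, bool with_fix)
{
	if (with_fix)
		model_detach(g);
	g->device_present = false;
}

/* Stand-in for close(group)/close(container). Assumption: cleanup only
 * reaches devices that are still present. */
static void model_close(struct model_group *g)
{
	if (g->device_present)
		model_detach(g);
}

/* Runs the scenario; returns the number of attaches left without a
 * matching detach. */
static int model_run(bool with_fix)
{
	struct model_group g = { .device_present = true };

	model_set_iommu(&g);
	model_remove_device(&g, with_fix);
	model_close(&g);
	return g.attach_calls - g.detach_calls;
}
```

In this model, model_run(false) leaves one attach without a matching
detach, while model_run(true) balances them.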