Received: by 2002:a05:6a10:2785:0:0:0:0 with SMTP id ia5csp2472419pxb; Mon, 11 Jan 2021 10:26:13 -0800 (PST) X-Google-Smtp-Source: ABdhPJw1UIuPupSTZDn/co14PfEup9+bZIgl4excBUmIA1kFzU0NmIVTUskaj/Ozfoqj2R92bnyv X-Received: by 2002:a50:f404:: with SMTP id r4mr526862edm.62.1610389573698; Mon, 11 Jan 2021 10:26:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1610389573; cv=none; d=google.com; s=arc-20160816; b=rrZbGZ+cjzATGBk2r+WIj4bV6U2vOSeEhUZwhA7UUi7Rb74Gg2i4i9RjzorSNOdOox iEpjAa820ByPHN4mT6eizAXbfz6oq2RsmCa8CjTTJhv9umHHrKOrdv2ROLwDh863XeKS 2ihpEa85jocv25RAE8C3iA6DQrpofnY4FaUH+tA5h+CuokYgqQjhCTOE+teoH+HCqlAK GzDHX8RM5erGneuP36VUJ+Oqb+ZqeTTM8YhMVGjqIwh7+DXanmCWV+WP30SPd+UKJaLD PmO3MrFogPSNYgtz+HjxyYGSkVvgsAqKOkTswHoJWMdAABcnty4lO9qCOOXzoFcO9ExA XW5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=mzN5bz6gja/X5dov9pXQmFrPtsmh4ja4sEQP9VQaaGg=; b=WeMgz+gGzoJbFIQPccdy6gZfpruEKHxbiss//Ex3uVl6VPkEdOlZHJvfm5lwZdVyOy MukweNcAk3sw5k4BPNxeB4JbKs+d6hyxt1IFWDPsdDTE9uZJgtsW8KNzeaAuAnHzd7ar Lu1jmo9PGVIOuoeREh7BlFsaV5nqgxH+rGRwOvONH1HhIMUFf324m1SD/P50mjXHvHUr t9lJQQfkTRBJnYbjVWh6OzevtkPNKjNOOgu8K9dXjQNkg6etTje5qJAln5NXvxT1LLoG fC2JJRvmJPxGqbBLe4hStlnjbVddPJo4aCKysF177J7RxIh+efcGRAFNXtNoZg8y8lNy tWXA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=alUB7W1U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id by23si97104ejb.165.2021.01.11.10.25.49; Mon, 11 Jan 2021 10:26:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=alUB7W1U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389731AbhAKSXi (ORCPT + 99 others); Mon, 11 Jan 2021 13:23:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59876 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726244AbhAKSXi (ORCPT ); Mon, 11 Jan 2021 13:23:38 -0500 Received: from mail-ej1-x635.google.com (mail-ej1-x635.google.com [IPv6:2a00:1450:4864:20::635]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CF01C061786; Mon, 11 Jan 2021 10:22:58 -0800 (PST) Received: by mail-ej1-x635.google.com with SMTP id w1so960339ejf.11; Mon, 11 Jan 2021 10:22:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=mzN5bz6gja/X5dov9pXQmFrPtsmh4ja4sEQP9VQaaGg=; b=alUB7W1UuzhBjLoGcICmod8MSAAajWbpmzGh9T1xUn34JUoLup1I+GSVAm9WzexkbT jbk/NKXpHhz9uv9L42ah4gSrf45f4BoesCxCQ4RkGYKRgYnbg6ajQMxwWXkCQ7ZakTCy QLIbOrCnOb4FX1q0BfIVqTR1h4mQ5P7PNGJ+WUb0DCSpwnjF2fBQjUucDnxEZBUg+j+U W8WWxcO0HYdp1Cxd3ICCpdSnt70MTlRxqF0IWPCExjlxCT6IaaOOpTdtF9cAoKj2+P73 vS3G/hlIpBS1uJaV1+QqLlNZuuPauJIBdtdD/nwaw1A7ORyONo/z13tXG23Si7seB8ZP 5aTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=mzN5bz6gja/X5dov9pXQmFrPtsmh4ja4sEQP9VQaaGg=; b=OOpK1ksfXowoTmvSRAaEFDDDvLBNX46knRK0+VMRKrIC0ObMcGlUzTA4eLCDLd8P3J qvuWvJYozjyDWurfxyk+chzZ2M3+nQYDcZ6GomUdMnYqMnq2H8zapCw4B8JYShQxqerW wlv99gJD6OX+9+Y53RHICdOTZM7PXPjjmIZ6gTEBEBk5bKFJRhQSwVNGVmeogdBiXfCv o0vPd3ROGb6v6FCxwXl1FFKphoSqKkXCrdY4QJqw7ylnpmQVYRnIcvETCbFUgAJJAzCD W7yDFeS7vHl1jPM5/ihZ1rmnvYIW21gtJ4xoDbsoWsiNqBJYi40fElx8gRdc3UeE7Rbc SHGA== X-Gm-Message-State: AOAM531G/kwG7ueWcuBCci0zyidfYQa7/D1Yn5MqJltU+VJcWzL0Dmcf ZLu6q9umjXW7vbsT15qzdgAURu42Aak= X-Received: by 2002:a17:906:4d59:: with SMTP id b25mr559961ejv.8.1610389376719; Mon, 11 Jan 2021 10:22:56 -0800 (PST) Received: from [192.168.178.40] (ipbcc05d1b.dynamic.kabel-deutschland.de. [188.192.93.27]) by smtp.gmail.com with ESMTPSA id b19sm298460edx.47.2021.01.11.10.22.55 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 11 Jan 2021 10:22:56 -0800 (PST) Subject: Re: [PATCH] scsi: target: tcmu: Fix wrong uio handling causing big memory leak To: linux-scsi@vger.kernel.org, target-devel@vger.kernel.org, linux-kernel@vger.kernel.org, "Martin K. Petersen" Cc: Greg Kroah-Hartman , Mike Christie , Bodo Stroesser References: <20201218141534.9918-1-bostroesser@gmail.com> From: Bodo Stroesser Message-ID: <703e98c0-31e6-7d3d-c818-d1f071eb552c@gmail.com> Date: Mon, 11 Jan 2021 19:22:54 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20201218141534.9918-1-bostroesser@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org I'm sending a gentle ping for the below patch. Especially I'm wondering whether the way how tcmu_destroy_device() now 'stops' the uio device is ok. Should we better have something like uio_stop_device() instead? Thank you, Bodo On 18.12.20 15:15, Bodo Stroesser wrote: > tcmu calls uio_unregister_device from tcmu_destroy_device. > After that uio will never call tcmu_release for this device. > If userspace still had the uio device open and / or mmap'ed > during uio_unregister_device, tcmu_release will not be called and > udev->kref will never go down to 0. > > So tcmu in this situation will not free: > - cmds or tmrs still in the queue or the ring > - all pages allocated for mailbox and cmd_ring (vmalloc) > - all pages allocated for data space > - the udev itself > > The vmalloc'ed area is 8 MB, amount of pages allocated for data > space depends on previous usage of the tcmu device. Theoretically > that can be up to 1GB. > > This patch moves the call of uio_unregister_device to the > beginning of tcmu_dev_kref_release, which is called when > udev->kref drops down to zero. So we know, that uio device is > closed and unmap'ed. > > In case tcmu_realease drops the last kref, we would end up doing > the uio_unregister_device from a function called by uio_release, > which causes the process to block forever. > So we now do the kref_put from new worker function > tcmu_release_work_fn which is scheduled by tcmu_release. > > To make userspace still aware of the device being deleted, > tcmu_destroy_device instead of uio_unregister_device now does: > - sets a bit in udev, so that tcmu_open and tcmu_mmap can check > and fail with -EIO > - resets udev->uio_info->irq to 0, so uio will fail read() and > write() with -EIO > - wakes up userspace possibly waiting in read(), so the read > fails with -EIO > > Avoid possible races in tcmu_open by replacing kref_get with > kref_get_unless_zero. > > Signed-off-by: Bodo Stroesser > --- > drivers/target/target_core_user.c | 54 ++++++++++++++++++++++++++++++++++++--- > 1 file changed, 50 insertions(+), 4 deletions(-) > > diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c > index 0458bfb143f8..080760985ebf 100644 > --- a/drivers/target/target_core_user.c > +++ b/drivers/target/target_core_user.c > @@ -21,6 +21,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -109,6 +110,7 @@ struct tcmu_nl_cmd { > struct tcmu_dev { > struct list_head node; > struct kref kref; > + struct work_struct release_work; > > struct se_device se_dev; > > @@ -119,6 +121,7 @@ struct tcmu_dev { > #define TCMU_DEV_BIT_BROKEN 1 > #define TCMU_DEV_BIT_BLOCKED 2 > #define TCMU_DEV_BIT_TMR_NOTIFY 3 > +#define TCMU_DEV_BIT_GOING_DOWN 4 > unsigned long flags; > > struct uio_info uio_info; > @@ -1527,6 +1530,8 @@ static void tcmu_detach_hba(struct se_hba *hba) > hba->hba_ptr = NULL; > } > > +static void tcmu_release_work_fn(struct work_struct *work); > + > static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name) > { > struct tcmu_dev *udev; > @@ -1542,6 +1547,8 @@ static struct se_device *tcmu_alloc_device(struct se_hba *hba, const char *name) > return NULL; > } > > + INIT_WORK(&udev->release_work, tcmu_release_work_fn); > + > udev->hba = hba; > udev->cmd_time_out = TCMU_TIME_OUT; > udev->qfull_time_out = -1; > @@ -1719,6 +1726,9 @@ static int tcmu_mmap(struct uio_info *info, struct vm_area_struct *vma) > { > struct tcmu_dev *udev = container_of(info, struct tcmu_dev, uio_info); > > + if (test_bit(TCMU_DEV_BIT_GOING_DOWN, &udev->flags)) > + return -EIO; > + > vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; > vma->vm_ops = &tcmu_vm_ops; > > @@ -1735,12 +1745,17 @@ static int tcmu_open(struct uio_info *info, struct inode *inode) > { > struct tcmu_dev *udev = container_of(info, struct tcmu_dev, uio_info); > > + if (test_bit(TCMU_DEV_BIT_GOING_DOWN, &udev->flags)) > + return -EIO; > + > /* O_EXCL not supported for char devs, so fake it? */ > if (test_and_set_bit(TCMU_DEV_BIT_OPEN, &udev->flags)) > return -EBUSY; > > udev->inode = inode; > - kref_get(&udev->kref); > + if (!kref_get_unless_zero(&udev->kref)) > + /* Race between open and device going down */ > + return -EIO; > > pr_debug("open\n"); > > @@ -1799,6 +1814,8 @@ static void tcmu_dev_kref_release(struct kref *kref) > bool all_expired = true; > int i; > > + uio_unregister_device(&udev->uio_info); > + > vfree(udev->mb_addr); > udev->mb_addr = NULL; > > @@ -1827,6 +1844,15 @@ static void tcmu_dev_kref_release(struct kref *kref) > call_rcu(&dev->rcu_head, tcmu_dev_call_rcu); > } > > +static void tcmu_release_work_fn(struct work_struct *work) > +{ > + struct tcmu_dev *udev = container_of(work, struct tcmu_dev, > + release_work); > + > + /* release ref from open */ > + kref_put(&udev->kref, tcmu_dev_kref_release); > +} > + > static int tcmu_release(struct uio_info *info, struct inode *inode) > { > struct tcmu_dev *udev = container_of(info, struct tcmu_dev, uio_info); > @@ -1834,8 +1860,17 @@ static int tcmu_release(struct uio_info *info, struct inode *inode) > clear_bit(TCMU_DEV_BIT_OPEN, &udev->flags); > > pr_debug("close\n"); > - /* release ref from open */ > - kref_put(&udev->kref, tcmu_dev_kref_release); > + > + /* > + * We must not put kref directly from here, since dropping down kref to > + * zero would implicitly call tcmu_dev_kref_release, which calls > + * uio_unregister_device --> process hangs forever, since tcmu_release > + * is called from uio. > + * So we leave it to tcmu_release_work_fn to put the kref. > + */ > + while (!schedule_work(&udev->release_work)) > + usleep_range(1000, 5000); > + > return 0; > } > > @@ -2166,7 +2201,18 @@ static void tcmu_destroy_device(struct se_device *dev) > > tcmu_send_dev_remove_event(udev); > > - uio_unregister_device(&udev->uio_info); > + /* > + * We must not call uio_unregister_device here. If there is a userspace > + * process with open or mmap'ed uio device, uio would not call > + * tcmu_release on later unmap or close. > + */ > + > + /* reset uio_info->irq, so uio will reject read() and write() */ > + udev->uio_info.irq = 0; > + /* Set bit, so we can reject later calls to tcmu_open and tcmu_mmap */ > + set_bit(TCMU_DEV_BIT_GOING_DOWN, &udev->flags); > + /* wake up possible sleeper in uio_read(), it will return -EIO */ > + uio_event_notify(&udev->uio_info); > > /* release ref from configure */ > kref_put(&udev->kref, tcmu_dev_kref_release); >