From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Francois Rigault, Jorgen Hansen, Adit Ranadive, Alexios Zavras, Vishnu DASA, Nadav Amit
Subject: [PATCH 5.2 117/143] VMCI: Release resource if the work is already queued
Date: Wed, 4 Sep 2019 19:54:20 +0200
Message-Id: <20190904175318.977217542@linuxfoundation.org>
In-Reply-To: <20190904175314.206239922@linuxfoundation.org>
References: <20190904175314.206239922@linuxfoundation.org>

From: Nadav Amit

commit ba03a9bbd17b149c373c0ea44017f35fc2cd0f28 upstream.

Francois reported that VMware balloon gets stuck after a balloon reset,
when the VMCI doorbell is removed. A similar error can occur when the
balloon driver is removed with the following splat:

[ 1088.622000] INFO: task modprobe:3565 blocked for more than 120 seconds.
[ 1088.622035] Tainted: G W 5.2.0 #4
[ 1088.622087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.622205] modprobe D 0 3565 1450 0x00000000
[ 1088.622210] Call Trace:
[ 1088.622246] __schedule+0x2a8/0x690
[ 1088.622248] schedule+0x2d/0x90
[ 1088.622250] schedule_timeout+0x1d3/0x2f0
[ 1088.622252] wait_for_completion+0xba/0x140
[ 1088.622320] ? wake_up_q+0x80/0x80
[ 1088.622370] vmci_resource_remove+0xb9/0xc0 [vmw_vmci]
[ 1088.622373] vmci_doorbell_destroy+0x9e/0xd0 [vmw_vmci]
[ 1088.622379] vmballoon_vmci_cleanup+0x6e/0xf0 [vmw_balloon]
[ 1088.622381] vmballoon_exit+0x18/0xcc8 [vmw_balloon]
[ 1088.622394] __x64_sys_delete_module+0x146/0x280
[ 1088.622408] do_syscall_64+0x5a/0x130
[ 1088.622410] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1088.622415] RIP: 0033:0x7f54f62791b7
[ 1088.622421] Code: Bad RIP value.
[ 1088.622421] RSP: 002b:00007fff2a949008 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1088.622426] RAX: ffffffffffffffda RBX: 000055dff8b55d00 RCX: 00007f54f62791b7
[ 1088.622426] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055dff8b55d68
[ 1088.622427] RBP: 000055dff8b55d00 R08: 00007fff2a947fb1 R09: 0000000000000000
[ 1088.622427] R10: 00007f54f62f5cc0 R11: 0000000000000206 R12: 000055dff8b55d68
[ 1088.622428] R13: 0000000000000001 R14: 000055dff8b55d68 R15: 00007fff2a94a3f0

The cause for the bug is that when the "delayed" doorbell is invoked, it
takes a reference on the doorbell entry and schedules work that is
supposed to run the appropriate code and drop the doorbell entry
reference. The code ignores the fact that if the work is already queued,
it will not be scheduled to run one more time. As a result one of the
references would not be dropped. When the code waits for the reference
to get to zero, during balloon reset or module removal, it gets stuck.

Fix it. Drop the reference if schedule_work() indicates that the work is
already queued.

Note that this bug got more apparent (or apparent at all) due to
commit ce664331b248 ("vmw_balloon: VMCI_DOORBELL_SET does not check status").
Fixes: 83e2ec765be03 ("VMCI: doorbell implementation.")
Reported-by: Francois Rigault
Cc: Jorgen Hansen
Cc: Adit Ranadive
Cc: Alexios Zavras
Cc: Vishnu DASA
Cc: stable@vger.kernel.org
Signed-off-by: Nadav Amit
Reviewed-by: Vishnu Dasa
Link: https://lore.kernel.org/r/20190820202638.49003-1-namit@vmware.com
Signed-off-by: Greg Kroah-Hartman
---
 drivers/misc/vmw_vmci/vmci_doorbell.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/misc/vmw_vmci/vmci_doorbell.c
+++ b/drivers/misc/vmw_vmci/vmci_doorbell.c
@@ -310,7 +310,8 @@ int vmci_dbell_host_context_notify(u32 s
 
 	entry = container_of(resource, struct dbell_entry, resource);
 	if (entry->run_delayed) {
-		schedule_work(&entry->work);
+		if (!schedule_work(&entry->work))
+			vmci_resource_put(resource);
 	} else {
 		entry->notify_cb(entry->client_data);
 		vmci_resource_put(resource);
@@ -361,7 +362,8 @@ static void dbell_fire_entries(u32 notif
 		    atomic_read(&dbell->active) == 1) {
 			if (dbell->run_delayed) {
 				vmci_resource_get(&dbell->resource);
-				schedule_work(&dbell->work);
+				if (!schedule_work(&dbell->work))
+					vmci_resource_put(&dbell->resource);
 			} else {
 				dbell->notify_cb(dbell->client_data);
 			}
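
The reference accounting described in the commit message can be modeled
in plain userspace C. The sketch below is illustrative only and is not
kernel code: every model_* name is hypothetical, and the "work queue" is
reduced to a single pending flag, on the assumption that the only
property that matters here is that schedule_work() returns false when
the work item is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_work {
	bool pending;		/* true while the work item sits in the queue */
};

struct model_entry {
	int refcount;		/* models the doorbell entry's resource refcount */
	struct model_work work;
};

/* Mimics schedule_work(): returns false if the work was already queued. */
static bool model_schedule_work(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Mimics the delayed handler: runs once per queued item, drops one ref. */
static void model_run_work(struct model_entry *e)
{
	if (!e->work.pending)
		return;
	e->work.pending = false;
	assert(e->refcount > 0);
	e->refcount--;
}

/* Ring the doorbell; 'fixed' selects the behaviour after this patch. */
static void model_fire(struct model_entry *e, bool fixed)
{
	e->refcount++;		/* reference held on behalf of the work */
	if (!model_schedule_work(&e->work) && fixed)
		e->refcount--;	/* already queued: no extra run will drop it */
}

/* Ring twice before the worker runs; return the references left over. */
static int model_leaked_refs(bool fixed)
{
	struct model_entry e = { .refcount = 0, .work = { .pending = false } };

	model_fire(&e, fixed);
	model_fire(&e, fixed);	/* second ring finds the work still pending */
	model_run_work(&e);
	return e.refcount;
}
```

In this model, model_leaked_refs(false) returns 1 (one reference is
never dropped, so a waiter for refcount zero would block forever), and
model_leaked_refs(true) returns 0.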