Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753405AbcDRKu2 (ORCPT ); Mon, 18 Apr 2016 06:50:28 -0400 Received: from mail.kernel.org ([198.145.29.136]:35106 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752244AbcDRKuY (ORCPT ); Mon, 18 Apr 2016 06:50:24 -0400 From: lizf@kernel.org To: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Joseph Qi , Joel Becker , Mark Fasheh , "Junxiao Bi" , Andrew Morton , Linus Torvalds , Zefan Li Subject: [PATCH 3.4 61/92] ocfs2/dlm: fix deadlock when dispatch assert master Date: Mon, 18 Apr 2016 18:46:06 +0800 Message-Id: <1460976397-5688-61-git-send-email-lizf@kernel.org> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1460976338-5631-1-git-send-email-lizf@kernel.org> References: <1460976338-5631-1-git-send-email-lizf@kernel.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3076 Lines: 95 From: Joseph Qi 3.4.112-rc1 review patch. If anyone has any objections, please let me know. ------------------ commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a upstream. The order of the following three spinlocks should be: dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock But dlm_dispatch_assert_master() is called while holding dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls dlm_grab() which will take dlm_domain_lock. Once another thread (for example, dlm_query_join_handler) has already taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock happens. Signed-off-by: Joseph Qi Cc: Joel Becker Cc: Mark Fasheh Cc: "Junxiao Bi" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds [lizf: Backported to 3.4: adjust context] Signed-off-by: Zefan Li --- fs/ocfs2/dlm/dlmmaster.c | 4 +++- fs/ocfs2/dlm/dlmrecovery.c | 6 +++++- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c index 7ba6ac1..751efa8 100644 --- a/fs/ocfs2/dlm/dlmmaster.c +++ b/fs/ocfs2/dlm/dlmmaster.c @@ -1411,6 +1411,7 @@ int dlm_master_request_handler(struct o2net_msg *msg, u32 len, void *data, int found, ret; int set_maybe; int dispatch_assert = 0; + int dispatched = 0; if (!dlm_grab(dlm)) return DLM_MASTER_RESP_NO; @@ -1617,6 +1618,8 @@ send_response: mlog(ML_ERROR, "failed to dispatch assert master work\n"); response = DLM_MASTER_RESP_ERROR; dlm_lockres_put(res); + } else { + dispatched = 1; } } else { if (res) @@ -2041,7 +2044,6 @@ int dlm_dispatch_assert_master(struct dlm_ctxt *dlm, /* queue up work for dlm_assert_master_worker */ - dlm_grab(dlm); /* get an extra ref for the work item */ dlm_init_work_item(dlm, item, dlm_assert_master_worker, NULL); item->u.am.lockres = res; /* already have a ref */ /* can optionally ignore node numbers higher than this node */ diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index d15b071..0e5013e 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dlmrecovery.c @@ -1689,6 +1689,7 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data, unsigned int hash; int master = DLM_LOCK_RES_OWNER_UNKNOWN; u32 flags = DLM_ASSERT_MASTER_REQUERY; + int dispatched = 0; if (!dlm_grab(dlm)) { /* since the domain has gone away on this @@ -1710,6 +1711,8 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data, mlog_errno(-ENOMEM); /* retry!? */ BUG(); + } else { + dispatched = 1; } } else /* put.. incase we are not the master */ dlm_lockres_put(res); @@ -1717,7 +1720,8 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data, } spin_unlock(&dlm->spinlock); - dlm_put(dlm); + if (!dispatched) + dlm_put(dlm); return master; } -- 1.9.1