Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756010AbbEUPdq (ORCPT ); Thu, 21 May 2015 11:33:46 -0400 Received: from mga11.intel.com ([192.55.52.93]:46894 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755862AbbEUPdn convert rfc822-to-8bit (ORCPT ); Thu, 21 May 2015 11:33:43 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.13,469,1427785200"; d="scan'208";a="496600034" From: "Drokin, Oleg" To: Dan Carpenter , "nikitas_angelinas@xyratex.com" , "Zhen, Liang" CC: AdrianRemonda , "open list:STAGING SUBSYSTEM" , "Dilger, Andreas" , Julia Lawall , Greg Kroah-Hartman , open list , Greg Donald , "moderated list:STAGING - LUSTRE..." , Joe Perches Subject: Re: [PATCH 4/4] Staging: lustre: sparse lock warning fix Thread-Topic: [PATCH 4/4] Staging: lustre: sparse lock warning fix Thread-Index: AQHQkZlsI0pUIZUiZ0+MQLtiJXgVVp2Cs0qAgAPbegCAAHRjAIAABf+A Date: Thu, 21 May 2015 15:33:34 +0000 Message-ID: References: <1431974091-26363-1-git-send-email-adrianremonda@gmail.com> <1431974091-26363-2-git-send-email-adrianremonda@gmail.com> <1431974091-26363-3-git-send-email-adrianremonda@gmail.com> <1431974091-26363-4-git-send-email-adrianremonda@gmail.com> <1431974091-26363-5-git-send-email-adrianremonda@gmail.com> <20150518212115.GN14154@mwanda> <20150521081534.GA13975@debian> <20150521151208.GI4150@mwanda> In-Reply-To: <20150521151208.GI4150@mwanda> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.254.81.105] Content-Type: text/plain; charset="us-ascii" Content-ID: Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1985 Lines: 39 On May 21, 2015, at 11:12 AM, Dan Carpenter wrote: > Oh, sorry, I didn't read your patch very carefully. It won't cause a > deadlock. But I'm going to assume it's still not right until lustre > expert Acks it. I just took a closer look and it appears original code is buggy and the patch just propagates the bugginess. If we look at the nrs_policy_put_locked, it eventually ends up in nrs_policy_stop0, it would hold a lock on whatever happened to be the first policy in the array not NULL. But nrs_policy_stop0 would unlock the lock on the policy it was called on (already likely a deadlock material) and then relock it. The problems would arise only if there are more than one nrs policy registered which is theoretically possible, but certainly makes no sense a client (besides, none of the advanced NRS policies made it in anyway and I almost feel like they just add unnecessary complication in client-only code). The code looks elaborate enough as if the first policy lock is to be always used as the guardian lock, but then stop0 behavior might be a bug then? Or it's possible we never end up in stop0 due to nrs state machine? Let's see what Nikitas and Liang remember about any of this (one of them is the original author of this code, but I am not sure who.) Nikitas, Liang: The code in question is in nrs_resource_put_safe: for (i = 0; i < NRS_RES_MAX; i++) { if (pols[i] == NULL) continue; if (nrs == NULL) { nrs = pols[i]->pol_nrs; spin_lock(&nrs->nrs_lock); } nrs_policy_put_locked(pols[i]); } if (nrs != NULL) spin_unlock(&nrs->nrs_lock); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/