Date: Tue, 2 Jan 2024 10:58:14 +0200
From: Leon Romanovsky
To: Shifeng Li
Cc: jgg@ziepe.ca, wenglianfa@huawei.com, gustavoars@kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, Shifeng Li
Subject: Re: [PATCH] RDMA/device: Fix a race between mad_client and cm_client init
Message-ID: <20240102085814.GD6361@unreal>
References: <20240102034335.34842-1-lishifeng@sangfor.com.cn>
In-Reply-To: <20240102034335.34842-1-lishifeng@sangfor.com.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 01, 2024 at 07:43:35PM -0800, Shifeng Li wrote:
> The mad_client will be initialized in enable_device_and_get(), while the
> devices_rwsem will be downgraded to a read semaphore. There is a window
> that leads to the failed initialization for cm_client, since it can not
> get matched mad port from ib_mad_port_list, and the matched mad port will
> be added to the list after that.
>
> mad_client        | cm_client
> ------------------|--------------------------------------------------------
> ib_register_device|
> enable_device_and_get
> down_write(&devices_rwsem)
> xa_set_mark(&devices, DEVICE_REGISTERED)
> downgrade_write(&devices_rwsem)
>                   |
>                   |ib_cm_init
>                   |ib_register_client(&cm_client)
>                   |down_read(&devices_rwsem)
>                   |xa_for_each_marked (&devices, DEVICE_REGISTERED)
>                   |add_client_context
>                   |cm_add_one
>                   |ib_register_mad_agent
>                   |ib_get_mad_port
>                   |__ib_get_mad_port
>                   |list_for_each_entry(entry, &ib_mad_port_list, port_list)
>                   |return NULL
>                   |up_read(&devices_rwsem)
>                   |
> add_client_context|
> ib_mad_init_device|
> ib_mad_port_open  |
> list_add_tail(&port_priv->port_list, &ib_mad_port_list)
> up_read(&devices_rwsem)
>                   |

How is this stack possible? ib_register_device() is called by drivers
and happens much later than ib_cm_init().

Thanks

>
> Fix it by using the devices_rwsem write semaphore to protect the mad_client
> init flow in enable_device_and_get().
>
> Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration")
> Cc: Shifeng Li
> Signed-off-by: Shifeng Li
> ---
>  drivers/infiniband/core/device.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 67bcea7a153c..85782786993d 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -1315,12 +1315,6 @@ static int enable_device_and_get(struct ib_device *device)
>  	down_write(&devices_rwsem);
>  	xa_set_mark(&devices, device->index, DEVICE_REGISTERED);
>
> -	/*
> -	 * By using downgrade_write() we ensure that no other thread can clear
> -	 * DEVICE_REGISTERED while we are completing the client setup.
> -	 */
> -	downgrade_write(&devices_rwsem);
> -
>  	if (device->ops.enable_driver) {
>  		ret = device->ops.enable_driver(device);
>  		if (ret)
> @@ -1337,7 +1331,7 @@ static int enable_device_and_get(struct ib_device *device)
>  	if (!ret)
>  		ret = add_compat_devs(device);
>  out:
> -	up_read(&devices_rwsem);
> +	up_write(&devices_rwsem);
>  	return ret;
>  }
>
> --
> 2.25.1
>