From: Etienne Carriere
Date: Tue, 16 May 2023 07:58:33 +0200
Subject: Re: [PATCH v6 3/4] tee: optee: support tracking system threads
To: Sumit Garg
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    op-tee@lists.trustedfirmware.org, Jens Wiklander, Sudeep Holla,
    Cristian Marussi
References: <20230505173012.881083-1-etienne.carriere@linaro.org>
    <20230505173012.881083-3-etienne.carriere@linaro.org>

Hello Sumit,

On Mon, 15 May 2023 at 10:48, Sumit Garg wrote:
>
> On Fri, 12 May 2023 at 10:27, Etienne Carriere wrote:
> >
> > On Thu, 11 May 2023 at 13:31, Sumit Garg wrote:
> > >
> > > On Thu, 11 May 2023 at 13:49, Etienne Carriere wrote:
> > > >
> > > > On Thu, 11 May 2023 at 09:27, Sumit Garg wrote:
> > > > > (snip)
> > > > >
> > > > > > > > > +bool optee_cq_inc_sys_thread_count(struct optee_call_queue *cq)
> > > > > > > > > +{
> > > > > > > > > +       bool rc = false;
> > > > > > > > > +
> > > > > > > > > +       mutex_lock(&cq->mutex);
> > > > > > > > > +
> > > > > > > > > +       /* Leave at least 1 normal (non-system) thread */
> > > > > > > >
> > > > > > > > IMO, this might be counterproductive, as most kernel drivers open a
> > > > > > > > session during driver probe which is only released in the driver
> > > > > > > > release method.
> > > > > > >
> > > > > > > Is it always the case?
> > > > > >
> > > > > > This answer of mine is irrelevant. Sorry,
> > > > > > Please read only the below comments of mine, especially:
> > > > > > | Note that an OP-TEE thread is not bound to a TEE session but rather
> > > > > > | bound to a yielded call to OP-TEE.
> > > > > >
> > > > > > > > If the kernel driver is built-in then the session is
> > > > > > > > never released. Now with system threads we would reserve an OP-TEE
> > > > > > > > thread for that kernel driver as well which will never be available to
> > > > > > > > regular user-space clients.
> > > > > > >
> > > > > > > That is not true. No driver currently requests their TEE thread to be
> > > > > > > a system thread.
> > > > > > > Only SCMI does because it needs to by construction.
> > > > > > >
> > > > >
> > > > > Yes, that's true, but what prevents future/current kernel TEE drivers
> > > > > from requesting a system thread once we have this patch-set landed?
> > > >
> > > > Only clients really needing this system_thread attribute should request it.
> > > > If they really need it, the OP-TEE firmware in secure world should
> > > > provision sufficient thread contexts.
> > >
> > > How do we quantify it? We definitely need a policy here regarding
> > > normal vs system threads.
> > >
> > > One argument in favor of kernel clients requiring system threads could
> > > be that we don't want to compete with user-space for OP-TEE threads.
> >
> > Sorry, I don't understand. What do you mean by qualifying this?
>
> I mean we have to fairly allocate threads among system and non-system
> thread invocations.
>
> > In an ideal situation, we would have OP-TEE provisioned with largely
> > sufficient thread contexts. However, there are systems with constrained
> > memory resources that lower the number of OP-TEE thread contexts as
> > much as possible.
> >
>
> Yeah, I think we are on the same page here.
>
> > > > > > > >
> > > > > > > > So I would rather suggest we only allow a
> > > > > > > > single system thread to be reserved as a starting point which is
> > > > > > > > relevant to this critical SCMI service. We can also make this upper
> > > > > > > > bound for system threads configurable with default value as 1 if
> > > > > > > > needed.
> > > > > >
> > > > > > Note that SCMI server can expose several SCMI channels (at most 1 per
> > > > > > SCMI protocol used) and each of them will need to request a
> > > > > > system_thread to TEE driver.
> > > > > >
> > > > > > Etienne
> > > > > >
> > > > > > >
> > > > > > > Reserving one or more system threads depends on the number of thread
> > > > > > > contexts provisioned by the TEE.
> > > > > > > Note that the implementation proposed here prevents Linux kernel from
> > > > > > > exhausting TEE threads so user space always has at least a TEE thread
> > > > > > > context left available.
> > > > >
> > > > > Yeah, but on the other hand user-space clients are comparatively
> > > > > larger in number than kernel clients, so they will be starved for
> > > > > OP-TEE thread availability. Consider a user-space client which needs
> > > > > to serve a lot of TLS connections just waiting for OP-TEE thread
> > > > > availability.
> > > >
> > > > Note that OP-TEE default configuration provisions (number of CPUs + 1)
> > > > thread contexts, so the situation is already present before these
> > > > changes on systems that embed an OP-TEE without a properly tuned
> > > > configuration. As I said above, Linux kernel cannot be responsible for
> > > > the total number of thread contexts provisioned in OP-TEE. If the
> > > > overall system requires a lot of TEE thread contexts, one should embed
> > > > a suitable OP-TEE firmware.
> > >
> > > Wouldn't the SCMI deadlock problem be solved with just having a lot of
> > > OP-TEE threads? But we are discussing the system threads solution here
> > > to make efficient use of OP-TEE threads. The total number of OP-TEE
> > > threads is definitely in control of OP-TEE but the control of how to
> > > schedule and efficiently use them lies with the Linux OP-TEE driver.
> > >
> > > So, given our overall discussion in this thread, how about the upper
> > > bound for system threads being 50% of the total number of OP-TEE
> > > threads?
> >
> > That would be a shame if the system does not use any Linux kernel
> > client sessions, only userland clients. This information cannot be
> > known by the Linux optee driver.
> > Instead of leaving at least 1 TEE thread context for regular sessions,
> > what if this change enforced 2? Or 3? Which count?
> > I think 1 is a fair choice: it allows supporting OP-TEE firmware with
> > a very small thread context pool (when running in small secure
> > memory), embedding only 2 or 3 contexts.
>
> IMO, leaving only 1 thread for user-space will starve TLS based
> applications. How about the following change on top of this patchset?
>
> diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> index 8b8181099da7..1deb5907d075 100644
> --- a/drivers/tee/optee/call.c
> +++ b/drivers/tee/optee/call.c
> @@ -182,8 +182,8 @@ bool optee_cq_inc_sys_thread_count(struct optee_call_queue *cq)
>
>         mutex_lock(&cq->mutex);
>
> -       /* Leave at least 1 normal (non-system) thread */
> -       if (cq->res_sys_thread_count + 1 < cq->total_thread_count) {
> +       /* Leave at least 50% for normal (non-system) threads */
> +       if (cq->res_sys_thread_count < cq->total_thread_count/2) {
>                 cq->free_normal_thread_count--;
>                 cq->res_sys_thread_count++;
>                 rc = true;
>
> > > > > > >
> > > > > > > Note that an OP-TEE thread is not bound to a TEE session but rather
> > > > > > > bound to a yielded call to OP-TEE.
> > > > >
> > > > > tee_client_open_session()
> > > > >   -> optee_open_session()
> > > > >
> > > > > tee_client_system_session()
> > > > >   -> optee_system_session()
> > > > >     -> optee_cq_inc_sys_thread_count()   <- At this point you
> > > > > reserve a system thread corresponding to a particular kernel client
> > > > > session
> > > > >
> > > > > All tee_client_invoke_func() invocations with a system thread capable
> > > > > session will use that reserved thread.
> > > > >
> > > > > tee_client_close_session()
> > > > >   -> optee_close_session()
> > > > >     -> optee_close_session_helper()
> > > > >       -> optee_cq_dec_sys_thread_count() <- At this point the
> > > > > reserved system thread is released
> > > > >
> > > > > Hasn't this tied the system thread to a particular TEE session? Or am
> > > > > I missing something?
> > > >
> > > > These changes do not define an overall single system thread.
> > > > If several sessions request reservation of a TEE system thread, as
> > > > many will be reserved.
> > > > Only the sessions with their sys_thread attribute set will use a
> > > > reserved thread. If such a kernel client issues several concurrent
> > > > calls to OP-TEE over that session, it will indeed consume more
> > > > reserved system threads than what is actually reserved. Here I think
> > > > it is the responsibility of such a client to open as many sessions as
> > > > requests. This is what scmi/optee driver does (see patch v6 4/4). An
> > > > alternative would be to have a ref count of sys_thread in session
> > > > contexts rather than a boolean value. I don't think it's worth it.
> > >
> > > Ah, I missed that during the review. The invocations with system
> > > threads should be limited by res_sys_thread_count in a similar manner
> > > as we do with normal threads via free_normal_thread_count. Otherwise,
> > > it's unfair for normal thread scheduling.
> > >
> > > I suppose there isn't any interdependency among SCMI channels itself
> > > such that a particular SCMI invocation can wait until the other SCMI
> > > invocation has completed.
> >
> > I think that would overcomplicate the logic.
> >
>
> We shouldn't allow system thread invocations to exceed the count
> actually reserved for system threads. One thing I am not able
> to understand here is why do you need a lot of system threads? Are
> SCMI operations too expensive? I suppose those should just involve
> configuring some register bits and using a single OP-TEE thread which
> is invoked sequentially should be enough.

OK, I get your point. I think you're right: reserving at most 1 TEE
thread for system sessions should be enough to prevent TEE entry call
deadlocks, which is the purpose of these changes.

Would you be OK with the following logic?

The optee driver would reserve at most 1 TEE call entry for system
sessions. If at least 1 kernel client claims a system session, a TEE
call entry is reserved for that purpose. Once all system sessions are
closed, the reserved TEE system call entry is released.
When a system thread calls the TEE, if the TEE system thread context
is not already in use, then that client consumes the reserved entry.
If the system thread context is already in use, then that client call
is treated as a regular call: it calls the TEE and would end up
waiting for a free thread if no TEE thread context is available.

Etienne

>
> -Sumit
>
> > Note I will send a patch v8 series but feel free to continue the discussion.
> > It will at least address other comments you shared.
> >
> > Best regards,
> > Etienne
> >
> > >
> > > -Sumit
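
For illustration, the "at most 1 reserved TEE call entry" logic proposed in this reply could be sketched in plain userspace C as below. This is only a sketch under stated assumptions, not the driver code: `struct cq_sketch`, `enum cq_entry` and all `cq_*` helpers are hypothetical names, locking is omitted (the real driver would presumably do this under `cq->mutex`), and the rule that a reservation needs a total of at least 2 thread contexts plus one currently free entry is an assumption carried over from the "leave at least 1 normal thread" comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver's call queue. */
struct cq_sketch {
	int total_thread_count;  /* thread contexts provisioned by the TEE */
	int normal_in_use;       /* regular entries currently inside the TEE */
	int sys_session_count;   /* currently open system sessions */
	bool sys_entry_in_use;   /* is the single reserved entry consumed? */
};

enum cq_entry { CQ_ENTRY_RESERVED, CQ_ENTRY_NORMAL, CQ_ENTRY_MUST_WAIT };

/* At most one entry is set aside, and only while a system session exists. */
static int cq_reserved_count(const struct cq_sketch *cq)
{
	return cq->sys_session_count > 0 ? 1 : 0;
}

/* The first system session triggers the reservation; later ones share it. */
static bool cq_open_system_session(struct cq_sketch *cq)
{
	if (cq->sys_session_count == 0) {
		/*
		 * Assumption: always keep 1 entry for regular calls, and only
		 * reserve an entry that is free at this moment.
		 */
		if (cq->total_thread_count < 2 ||
		    cq->normal_in_use >= cq->total_thread_count)
			return false;
	}
	cq->sys_session_count++;
	return true;
}

/* Closing the last system session implicitly releases the reservation. */
static void cq_close_system_session(struct cq_sketch *cq)
{
	assert(cq->sys_session_count > 0);
	/* Assumption: the caller has no call in flight on this session. */
	cq->sys_session_count--;
}

/* Decide which entry a call takes, or whether it has to wait. */
static enum cq_entry cq_enter_call(struct cq_sketch *cq, bool sys_session)
{
	int reserved = cq_reserved_count(cq);

	if (sys_session && reserved && !cq->sys_entry_in_use) {
		cq->sys_entry_in_use = true;
		return CQ_ENTRY_RESERVED;
	}
	/* Reserved entry busy or not applicable: treat as a regular call. */
	if (cq->normal_in_use < cq->total_thread_count - reserved) {
		cq->normal_in_use++;
		return CQ_ENTRY_NORMAL;
	}
	return CQ_ENTRY_MUST_WAIT;
}

/* Give back whichever entry cq_enter_call() handed out. */
static void cq_leave_call(struct cq_sketch *cq, enum cq_entry entry)
{
	if (entry == CQ_ENTRY_RESERVED)
		cq->sys_entry_in_use = false;
	else if (entry == CQ_ENTRY_NORMAL)
		cq->normal_in_use--;
}
```

With 2 thread contexts and one open system session, the first system call takes the reserved entry, a concurrent second system call falls back to the regular pool, and any further call has to wait until an entry is returned.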