Date: Thu, 17 Sep 2020 07:21:39 +0530
From: mansur@codeaurora.org
To: Stanimir Varbanov
Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, vgarodia@codeaurora.org
Subject: Re: [PATCH v2 1/3] venus: core: handle race condititon for core ops
In-Reply-To: <313cf565-f69f-df84-6bff-8c9a77b9f642@linaro.org>
References: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> <1599741856-16239-2-git-send-email-mansur@codeaurora.org> <313cf565-f69f-df84-6bff-8c9a77b9f642@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2020-09-11 15:40, Stanimir Varbanov wrote:
> On 9/10/20 3:44 PM, Mansur Alisha Shaik wrote:
>> For core ops we have only write protection and no read
>> protection. Because of this, under multithreading and
>> concurrency, one CPU core reads core->ops without waiting,
>> which causes a NULL pointer dereference crash.
>>
>> One such scenario is shown below: on one CPU core,
>> core->ops becomes NULL while another CPU core is
>> calling core->ops->session_init().
>>
>> CPU: core-7:
>> Call trace:
>>  hfi_session_init+0x180/0x1dc [venus_core]
>>  vdec_queue_setup+0x9c/0x364 [venus_dec]
>>  vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>>  vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>>  v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>>  v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>>  v4l_reqbufs+0x4c/0x5c
>>  __video_do_ioctl+0x2b0/0x39c
>>
>> CPU: core-0:
>> Call trace:
>>  venus_shutdown+0x98/0xfc [venus_core]
>>  venus_sys_error_handler+0x64/0x148 [venus_core]
>>  process_one_work+0x210/0x3d0
>>  worker_thread+0x248/0x3f4
>>  kthread+0x11c/0x12c
>>
>> Signed-off-by: Mansur Alisha Shaik
>> Acked-by: Stanimir Varbanov
>> ---
>> Changes in V2:
>> - Addressed review comments by stan by validating on top of
>>   https://lore.kernel.org/patchwork/project/lkml/list/?series=455962
>>
>>  drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/media/platform/qcom/venus/hfi.c
>> b/drivers/media/platform/qcom/venus/hfi.c
>> index a59022a..3137071 100644
>> --- a/drivers/media/platform/qcom/venus/hfi.c
>> +++ b/drivers/media/platform/qcom/venus/hfi.c
>> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>>  int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>>  {
>>      struct venus_core *core = inst->core;
>> -    const struct hfi_ops *ops = core->ops;
>> +    const struct hfi_ops *ops;
>>      int ret;
>>
>
> If we are in system error recovery, session_init cannot complete
> successfully, so we should exit early in the function.
>
> I'd suggest making it:
>
>     /* If core shutdown is in progress or we are in system error
>      * recovery, return an error */
>     mutex_lock(&core->lock);
>     if (!core->ops || core->sys_error) {
>         mutex_unlock(&core->lock);
>         return -EIO;
>     }
>     mutex_unlock(&core->lock);
>

I tried the above suggestion and reran the failing scenario; I did not
see any issues.
Posted new version
https://lore.kernel.org/patchwork/project/lkml/list/?series=463091

>>      if (inst->state != INST_UNINIT)
>> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst,
>> u32 pixfmt)
>>      inst->hfi_codec = to_codec_type(pixfmt);
>>      reinit_completion(&inst->done);
>>
>> +    mutex_lock(&core->lock);
>> +    ops = core->ops;
>>      ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>>      if (ret)
>>          return ret;
>>
>> +    mutex_unlock(&core->lock);
>>      ret = wait_session_msg(inst);
>>      if (ret)
>>          return ret;
>>
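
For reference, the check-under-lock pattern suggested in the review can be
sketched in userspace C as below. This is only a minimal sketch, assuming
pthread mutexes in place of kernel mutexes; struct demo_core, struct
demo_ops, demo_session_init() and demo_shutdown() are hypothetical names
made up for illustration and are not the venus driver code. The point it
shows is that the ops pointer and the error flag are read only while the
lock is held, and that every return path releases the lock.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (illustration only). */
struct demo_ops {
    int (*session_init)(int session_type);
};

struct demo_core {
    pthread_mutex_t lock;        /* protects ops and sys_error */
    const struct demo_ops *ops;  /* set to NULL by demo_shutdown() */
    bool sys_error;              /* true while error recovery is pending */
};

/* Stub backend that always succeeds. */
static int demo_hw_session_init(int session_type)
{
    (void)session_type;
    return 0;
}

static const struct demo_ops demo_hw_ops = {
    .session_init = demo_hw_session_init,
};

/* Writer side: clear the ops pointer under the lock. */
static void demo_shutdown(struct demo_core *core)
{
    pthread_mutex_lock(&core->lock);
    core->ops = NULL;
    pthread_mutex_unlock(&core->lock);
}

/*
 * Reader side: return -EIO if shutdown has run or error recovery is
 * pending. The lock is dropped on both the error and the success path.
 */
static int demo_session_init(struct demo_core *core, int session_type)
{
    const struct demo_ops *ops;

    pthread_mutex_lock(&core->lock);
    if (!core->ops || core->sys_error) {
        pthread_mutex_unlock(&core->lock);
        return -EIO;
    }
    ops = core->ops;             /* snapshot taken while holding the lock */
    pthread_mutex_unlock(&core->lock);

    return ops->session_init(session_type);
}
```

In this sketch the ops table is a static constant, so calling through the
snapshot after unlocking is safe. Whether the lock also has to be held
across the call in a real driver depends on the lifetime of the ops object,
which the sketch does not model.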