Date: Fri, 1 Sep 2023 10:28:08 +0800
From: "Guozihua (Scott)"
To: Herbert Xu, Lu Jialin
CC: Steffen Klassert, Daniel Jordan, "David S . Miller"
Subject: Re: [PATCH v2] crypto: Fix hungtask for PADATA_RESET
References: <20230823073047.1515137-1-lujialin4@huawei.com>
X-Mailing-List: linux-crypto@vger.kernel.org

On 2023/8/23 17:28, Herbert Xu wrote:
> On Wed, Aug 23, 2023 at 07:30:47AM +0000, Lu Jialin wrote:
>> We found a hungtask bug in test_aead_vec_cfg as follows:
>>
>> INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> Call trace:
>>  __switch_to+0x98/0xe0
>>  __schedule+0x6c4/0xf40
>>  schedule+0xd8/0x1b4
>>  schedule_timeout+0x474/0x560
>>  wait_for_common+0x368/0x4e0
>>  wait_for_completion+0x20/0x30
>>  test_aead_vec_cfg+0xab4/0xd50
>>  test_aead+0x144/0x1f0
>>  alg_test_aead+0xd8/0x1e0
>>  alg_test+0x634/0x890
>>  cryptomgr_test+0x40/0x70
>>  kthread+0x1e0/0x220
>>  ret_from_fork+0x10/0x18
>> Kernel panic - not syncing: hung_task: blocked tasks
>>
>> For padata_do_parallel, when the return err is 0 or -EBUSY, it will call
>> wait_for_completion(&wait->completion) in test_aead_vec_cfg. In normal
>> case, aead_request_complete() will be called in pcrypt_aead_serial and the
>> return err is 0 for padata_do_parallel. But, when pinst->flags is
>> PADATA_RESET, the return err is -EBUSY for padata_do_parallel, and it
>> won't call aead_request_complete(). Therefore, test_aead_vec_cfg will
>> hung at wait_for_completion(&wait->completion), which will cause
>> hungtask.
>>
>> The problem comes as following:
>> (padata_do_parallel)                  |
>>     rcu_read_lock_bh();               |
>>     err = -EINVAL;                    |   (padata_replace)
>>                                       |     pinst->flags |= PADATA_RESET;
>>     err = -EBUSY                      |
>>     if (pinst->flags & PADATA_RESET)  |
>>         rcu_read_unlock_bh()          |
>>         return err
>>
>> In order to resolve the problem, we retry at most 5 times when
>> padata_do_parallel return -EBUSY. For more than 5 times, we replace the
>> return err -EBUSY with -EAGAIN, which means parallel_data is changing, and
>> the caller should call it again.
>
> Steffen, should we retry this at all?  Or should it just fail as it
> did before?
>
> Thanks,

It should be fine if we don't retry, just fail with -EAGAIN, and let the
caller handle it. That should not break the meaning of the error code.

--
Best
GUO Zihua
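
For illustration only, the no-retry option discussed above can be modelled
in plain userspace C. This is a rough sketch, not kernel code and not the
actual patch: every model_* name and MODEL_PADATA_RESET is hypothetical, and
the "caller waits on 0 or -EBUSY" rule is taken from the quoted description
rather than from the real crypto API sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PADATA_RESET bit in pinst->flags. */
#define MODEL_PADATA_RESET 0x1u

/* Hypothetical, heavily simplified model of a padata instance. */
struct model_instance {
	unsigned int flags;
};

/* Hypothetical request; 'completed' models the completion being signalled. */
struct model_request {
	bool completed;
};

/*
 * Models the behaviour described in the thread: while a reset is in
 * progress the request is rejected with -EBUSY and nothing will ever
 * signal its completion.
 */
static int model_do_parallel(const struct model_instance *inst,
			     struct model_request *req)
{
	if (inst->flags & MODEL_PADATA_RESET)
		return -EBUSY;
	req->completed = true;	/* in this model the work finishes at once */
	return 0;
}

/* Models the no-retry option: report -EAGAIN instead of -EBUSY. */
static int model_submit(const struct model_instance *inst,
			struct model_request *req)
{
	int err = model_do_parallel(inst, req);

	return err == -EBUSY ? -EAGAIN : err;
}

/* Per the quoted description, the caller waits when err is 0 or -EBUSY. */
static bool model_caller_would_wait(int err)
{
	return err == 0 || err == -EBUSY;
}

/* Walks through both paths while a reset is in progress. */
static void model_self_check(void)
{
	struct model_instance inst = { .flags = MODEL_PADATA_RESET };
	struct model_request req = { .completed = false };
	int err;

	/* Raw -EBUSY: the caller waits, but nothing completes the request. */
	err = model_do_parallel(&inst, &req);
	assert(err == -EBUSY);
	assert(model_caller_would_wait(err) && !req.completed);

	/* Mapped to -EAGAIN: the caller sees a plain failure and moves on. */
	err = model_submit(&inst, &req);
	assert(err == -EAGAIN);
	assert(!model_caller_would_wait(err));
}
```

In this model the hang corresponds to the first pair of assertions: the
caller would wait on a request that is never completed. With the -EAGAIN
mapping the caller does not wait and can decide for itself whether to
resubmit.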