Message-ID: <1552350463.23859.8.camel@HansenPartnership.com>
Subject: Re: [PATCH] tpm: Make timeout logic simpler and more robust
From: James Bottomley
To: Calvin Owens, Peter Huewe, Jarkko Sakkinen, Jason Gunthorpe
Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-integrity@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@fb.com
Date: Mon, 11 Mar 2019 17:27:43 -0700
In-Reply-To: <358e89ed2b766d51b5f57abf31ab7a925ac63379.1552348123.git.calvinowens@fb.com>

On Mon, 2019-03-11 at 16:54 -0700, Calvin Owens wrote:
> We're having lots of problems with TPM commands timing out, and we're
> seeing these problems across lots of different hardware (both v1/v2).
>
> I instrumented the driver to collect latency data, but I wasn't able
> to find any specific timeout to fix: it seems like many of them are
> too aggressive. So I tried replacing all the timeout logic with a
> single universal long timeout, and found that makes our TPMs 100%
> reliable.
>
> Given that this timeout logic is very complex, problematic, and
> appears to serve no real purpose, I propose simply deleting all of
> it.

"no real purpose" is a bit strong given that all these timeouts are
standards mandated. The purpose stated by the standards is that there
needs to be a way of differentiating "the TPM crashed" from "the TPM is
taking a very long time to respond". For a normally functioning TPM it
looks complex and unnecessary, but for a malfunctioning one it's a
lifesaver.

Could you first check it's not a problem we introduced with our polling
changes?
My Nuvoton still doesn't work properly with the default poll timings,
but it works flawlessly if I use the patch below. I think my Nuvoton is
a bit out of spec (it's a very early model that was software upgraded
from 1.2 to 2.0) because no-one else on the list seems to see the
problems I see, but perhaps you are.

James

---

From 249d60a9fafa8638433e545b50dab6987346cb26 Mon Sep 17 00:00:00 2001
From: James Bottomley
Date: Wed, 11 Jul 2018 10:11:14 -0700
Subject: [PATCH] tpm.h: increase poll timings to fix tpm_tis regression

tpm_tis regressed recently to the point where the TPM being driven by
it falls off the bus and cannot be contacted after some hours of use.
This is the failure trace:

jejb@jarvis:~> dmesg|grep tpm
[    3.282605] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
[14566.626614] tpm tpm0: Operation Timed out
[14566.626621] tpm tpm0: tpm2_load_context: failed with a system error -62
[14568.626607] tpm tpm0: tpm_try_transmit: tpm_send: error -62
[14570.626594] tpm tpm0: tpm_try_transmit: tpm_send: error -62
[14570.626605] tpm tpm0: tpm2_load_context: failed with a system error -62
[14572.626526] tpm tpm0: tpm_try_transmit: tpm_send: error -62
[14577.710441] tpm tpm0: tpm_try_transmit: tpm_send: error -62
...

The problem is caused by a change that made us poke the TPM far more
often to see if it's ready. Apparently something about the bus it's on
and the TPM means that it crashes or falls off the bus if you poke it
too often, and once this happens, only a reboot will recover it.

The fix I've come up with is to adjust the timings so the TPM no longer
falls off the bus. Obviously, this fix works for my Nuvoton NPCT6xxx,
but that's the only TPM I've tested it with.

Fixes: 424eaf910c32 tpm: reduce polling time to usecs for even finer granularity
Signed-off-by: James Bottomley

diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 4b104245afed..a6c806d98950 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -64,8 +64,8 @@ enum tpm_timeout {
 	TPM_TIMEOUT_RETRY = 100, /* msecs */
 	TPM_TIMEOUT_RANGE_US = 300,	/* usecs */
 	TPM_TIMEOUT_POLL = 1,	/* msecs */
-	TPM_TIMEOUT_USECS_MIN = 100,	/* usecs */
-	TPM_TIMEOUT_USECS_MAX = 500	/* usecs */
+	TPM_TIMEOUT_USECS_MIN = 750,	/* usecs */
+	TPM_TIMEOUT_USECS_MAX = 1000,	/* usecs */
 };
 
 /* TPM addresses */
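
For readers who want to see why the poll interval matters, here is a
rough userspace sketch; it is not driver code. The names fake_tpm,
fake_tpm_ready and poll_until_ready are hypothetical, the clock is
simulated, and the sleep is modeled as one fixed interval, whereas the
driver presumably sleeps for some duration between
TPM_TIMEOUT_USECS_MIN and TPM_TIMEOUT_USECS_MAX. SKETCH_ETIME is
assumed to correspond to the -62 in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 62 mirrors the "error -62" lines in the failure trace. */
#define SKETCH_ETIME 62

/* Hypothetical stand-in for a TPM that becomes ready at a fixed time. */
struct fake_tpm {
	uint64_t ready_at_us;	/* simulated time at which status reads "ready" */
	unsigned int polls;	/* number of status reads issued so far */
};

/* One status read: counts the poke and reports whether the device is ready. */
static bool fake_tpm_ready(struct fake_tpm *tpm, uint64_t now_us)
{
	tpm->polls++;
	return now_us >= tpm->ready_at_us;
}

/*
 * Poll the device every poll_us microseconds of simulated time.
 * Returns 0 once the device reports ready, or -SKETCH_ETIME if it is
 * still not ready at the first poll at or after timeout_us. The timeout
 * is what separates "slow" from "crashed".
 */
static int poll_until_ready(struct fake_tpm *tpm, uint64_t poll_us,
			    uint64_t timeout_us)
{
	uint64_t now_us = 0;

	assert(poll_us > 0);
	for (;;) {
		if (fake_tpm_ready(tpm, now_us))
			return 0;
		if (now_us >= timeout_us)
			return -SKETCH_ETIME;
		now_us += poll_us;	/* models the sleep between polls */
	}
}
```

In this model, a command that completes after 3000 usecs costs 31
status reads at a 100 usec interval but only 5 at 750 usecs, which is
the kind of reduction in poking the patch is after. A device that never
becomes ready still produces a timeout error rather than an endless
wait.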