From: Andreas Dilger
Subject: Re: [PATCH 10/10] Dynamic fault injection
Date: Fri, 18 May 2018 14:54:18 -0600
To: Kent Overstreet
Cc: LKML, fsdevel, Andrew Morton, Dave Chinner, darrick.wong@oracle.com,
 tytso@mit.edu, linux-btrfs@vger.kernel.org, clm@fb.com, jbacik@fb.com,
 viro@zeniv.linux.org.uk, willy@infradead.org, peterz@infradead.org
In-Reply-To: <20180518191040.GG31737@kmo-pixel>
References: <20180518074918.13816-1-kent.overstreet@gmail.com>
 <20180518074918.13816-21-kent.overstreet@gmail.com>
 <905DA1CC-63F8-4020-A1D7-1F59ABDF3448@dilger.ca>
 <20180518191040.GG31737@kmo-pixel>
X-Mailing-List: linux-kernel@vger.kernel.org

On May 18, 2018, at 1:10 PM, Kent Overstreet wrote:
>
> On Fri, May 18, 2018 at 01:05:20PM -0600, Andreas Dilger wrote:
>> On May 18, 2018, at 1:49 AM, Kent Overstreet wrote:
>>>
>>> Signed-off-by: Kent Overstreet
>>
>> I agree with Christoph that even if there was some explanation in the cover
>> letter, there should be something at least as good in the patch itself.  The
>> cover letter is not saved, but the commit stays around forever, and should
>> explain how this should be added to code, and how to use it from userspace.
>>
>>
>> That said, I think this is a useful functionality.  We have something similar
>> in Lustre (OBD_FAIL_CHECK() and friends) that is necessary for being able to
>> test a distributed filesystem, which is just a CPP macro with an unlikely()
>> branch, while this looks more sophisticated.  This looks like it has some
>> added functionality like having more than one fault enabled at a time.
>> If this lands we could likely switch our code over to using this.
>
> This is pretty much what I was looking for, I just wanted to know if this
> patch was interesting enough to anyone that I should spend more time on it
> or just drop it :) Agreed on documentation. I think it's also worth
> factoring out the functionality for the elf section trick that dynamic
> debug uses too.
>
>> Some things that are missing from this patch that is in our code:
>>
>> - in addition to the basic "enabled" and "oneshot" mechanisms, we have:
>>   - timeout: sleep for N msec to simulate network/disk/locking delays
>>   - race: wait with one thread until a second thread hits matching check
>>
>> We also have a "fail_val" that allows making the check conditional (e.g.
>> only operation on server "N" should fail, only RPC opcode "N", etc).
>
> Those all sound like good ideas...
> fail_val especially, I think with that
> we'd have all the functionality the existing fault injection framework has
> (which is way too heavyweight to actually get used, imo)

The other thing we have that is slightly orthogonal to your modes, and which
is possible because the fault location is just a __u32, is that "oneshot" is
simply a mask bit ORed into the fault location.  Together with "fail_val",
this lets us add other masks: "fail N times", "fail randomly 1/N times", or
"pass N times before failure".  One further mask bit is set by the kernel
when the fault is actually hit, so that test scripts can poll until that
happens and then continue running.

The "fail randomly 1/N times" mode was useful for detecting memory allocation
failure handling under load, but that has been superseded by the same
functionality in kmalloc(), and it sounds like your fault injection can do
this deterministically for every allocation?

Cheers, Andreas
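
To make the mask scheme above concrete, here is a rough userspace sketch of
how mode bits ORed into a 32-bit fault location could be evaluated.  All of
the names (FI_*, struct fault_state, fault_check()) and the bit positions are
made up for illustration; they are not the actual Lustre definitions and not
anything from the patch.  It also ignores locking, which a kernel version
would need to handle with atomics.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: low 16 bits name the fault site, high bits are modes. */
#define FI_SITE_MASK 0x0000ffffu
#define FI_ONCE      0x80000000u /* "oneshot": fire a single time */
#define FI_FAILED    0x40000000u /* set by the checker once the fault fired */
#define FI_SKIP      0x20000000u /* pass fail_val times before failing */
#define FI_SOME      0x10000000u /* fail fail_val times, then stop */
#define FI_RAND      0x08000000u /* fail randomly, roughly 1 in fail_val */

struct fault_state {
	uint32_t loc;      /* armed site id ORed with mode bits; 0 = disarmed */
	uint32_t fail_val; /* the "N" used by the counting and random modes */
	uint32_t count;    /* matching checks (SKIP) or failures (SOME) so far */
};

/* Return true if the code at 'site' (a nonzero id) should inject a failure. */
bool fault_check(struct fault_state *fs, uint32_t site)
{
	if (site == 0 || (fs->loc & FI_SITE_MASK) != site)
		return false;

	/* A oneshot fault that has already fired stays quiet. */
	if ((fs->loc & FI_ONCE) && (fs->loc & FI_FAILED))
		return false;

	if (fs->loc & FI_RAND) {
		if (fs->fail_val == 0 || (uint32_t)rand() % fs->fail_val != 0)
			return false;
	} else if (fs->loc & FI_SKIP) {
		if (++fs->count <= fs->fail_val)
			return false;
	} else if (fs->loc & FI_SOME) {
		if (fs->count >= fs->fail_val)
			return false;
		fs->count++;
	}

	/* Record the hit so a test script polling 'loc' can see it. */
	fs->loc |= FI_FAILED;
	return true;
}
```

A test would arm a site with something like loc = 0x0101 | FI_SKIP | FI_ONCE
and fail_val = 2, start the workload, and poll loc until FI_FAILED shows up
before moving on to the next step.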