Date: Thu, 22 Feb 2024 15:54:26 +0100
From: Jacek Tomaka
Subject: NFS data corruption on congested network
To: trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, neilb@suse.de
Cc: linux-nfs@vger.kernel.org

Hello,

I ran into an issue where an NFS file ends up corrupted on disk. We started noticing it on certain, quite old hardware after upgrading the OS from CentOS 6 to Rocky 9.2. We also see it on Rocky 9.3, but not on 9.1.

After some investigation we have reason to believe that the change was introduced by the following commit:
https://github.com/torvalds/linux/commit/6df25e58532be7a4cd6fb15bcd85805947402d91

We write a number of files on a single thread. Each file is up to 4GB. Before closing we call fdatasync. Sometimes the file ends up corrupted. The corruption takes the form of a number of zero-filled pages (more than 3k pages in one case). When this happens, the file cannot be deleted from the client machine that created it, even though the process that wrote the file completed successfully.

The machines have about 128GB of memory, I think, and probably a network that leaves something to be desired.

My reproducer is currently tied to our internal software, but I suspect that setting the write_congested flag randomly should allow the issue to be reproduced.

Regards,
Jacek Tomaka
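
For illustration only, the write pattern and the corruption check described above can be sketched roughly as follows. This is not the internal reproducer and is not known to trigger the bug. The function names, the fill byte, the chunk size, and the 4096-byte page size are all assumptions made for the sketch.

```python
import os

PAGE_SIZE = 4096  # assumed page size; the report does not state it


def write_file(path, total_bytes, chunk_size=1 << 20):
    """Write total_bytes of non-zero data from a single thread,
    then call fdatasync before closing, as in the report."""
    chunk = b"\xa5" * chunk_size  # non-zero fill so zeroed pages stand out
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        remaining = total_bytes
        while remaining > 0:
            # os.write may write fewer bytes than requested
            written = os.write(fd, chunk[:min(chunk_size, remaining)])
            remaining -= written
        os.fdatasync(fd)
    finally:
        os.close(fd)


def count_zero_pages(path, page_size=PAGE_SIZE):
    """Return the number of page-sized blocks that contain only zero bytes."""
    zero_page = bytes(page_size)
    zero_pages = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(page_size)
            if not page:
                break
            # a short final block is compared against zeros of equal length
            if page == zero_page[:len(page)]:
                zero_pages += 1
    return zero_pages
```

A run would call write_file() on a path under the NFS mount and then count_zero_pages() on the result; any non-zero count indicates the kind of damage described. Reading the file back on the writing client may be served from that client's page cache, so checking from another client or on the server is probably the safer test.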