From: Ritesh Harjani
To: jack@suse.cz, tytso@mit.edu, linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, mbobrowski@mbobrowski.org, riteshh@linux.ibm.com
Subject: [RFCv3 4/4] ext4: Move to shared iolock even without dioread_nolock mount opt
Date: Wed, 20 Nov 2019 10:30:24 +0530
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20191120050024.11161-1-riteshh@linux.ibm.com>
References: <20191120050024.11161-1-riteshh@linux.ibm.com>
Message-Id: <20191120050024.11161-5-riteshh@linux.ibm.com>
X-Mailing-List: linux-ext4@vger.kernel.org

We were using shared locking for DIO overwrites only when the
dioread_nolock mount option was set. With the current code this
mount option condition is no longer needed, since:

1. There is no race between buffered writes & DIO overwrites:
   buffered IO writes take the exclusive lock & DIO overwrites take
   the shared lock. The DIO path also makes sure to flush and wait
   for any dirty page cache data.

2. There is no race between buffered reads & DIO overwrites, since
   no block allocation is possible with DIO overwrites, so no stale
   data exposure should happen. The same holds between DIO reads &
   DIO overwrites.

3. Other paths like truncate are also protected, since we wait there
   for any DIO in flight to be over.

4. In case of buffered IO writes followed by DIO reads: here also we
   take exclusive locks in ext4_write_begin/end(), so there is no
   risk of exposing any stale data. After ext4_write_end(),
   iomap_dio_rw() will flush & wait for any dirty page cache data.

Signed-off-by: Ritesh Harjani
---
 fs/ext4/file.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 18cbf9fa52c6..b97efc89cd63 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -383,6 +383,17 @@ static const struct iomap_dio_ops ext4_dio_write_ops = {
 	.end_io = ext4_dio_write_end_io,
 };
 
+static bool ext4_dio_should_shared_lock(struct inode *inode)
+{
+	if (!S_ISREG(inode->i_mode))
+		return false;
+	if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)))
+		return false;
+	if (ext4_should_journal_data(inode))
+		return false;
+	return true;
+}
+
 /*
  * The intention here is to start with shared lock acquired then see if any
  * condition requires an exclusive inode lock. If yes, then we restart the
@@ -394,8 +405,8 @@ static const struct iomap_dio_ops ext4_dio_write_ops = {
  * - For extending writes case we don't take the shared lock, since it requires
  *   updating inode i_disksize and/or orphan handling with exclusive lock.
  *
- * - shared locking will only be true mostly in case of overwrites with
- *   dioread_nolock mode. Otherwise we will switch to excl. iolock mode.
+ * - shared locking will only be true mostly in case of overwrites.
+ *   Otherwise we will switch to excl. iolock mode.
  */
 static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
 				     unsigned int *iolock, bool *unaligned_io,
@@ -433,15 +444,14 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
 		*extend = true;
 	/*
 	 * Determine whether the IO operation will overwrite allocated
-	 * and initialized blocks. If so, check to see whether it is
-	 * possible to take the dioread_nolock path.
+	 * and initialized blocks.
 	 *
 	 * We need exclusive i_rwsem for changing security info
 	 * in file_modified().
 	 */
 	if (*iolock == EXT4_IOLOCK_SHARED &&
 	    (!IS_NOSEC(inode) || *unaligned_io || *extend ||
-	     !ext4_should_dioread_nolock(inode) ||
+	     !ext4_dio_should_shared_lock(inode) ||
 	     !ext4_overwrite_io(inode, offset, count))) {
 		ext4_iunlock(inode, *iolock);
 		*iolock = EXT4_IOLOCK_EXCL;
@@ -485,7 +495,10 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		iolock = EXT4_IOLOCK_EXCL;
 	}
 
-	if (iolock == EXT4_IOLOCK_SHARED && !ext4_should_dioread_nolock(inode))
+	/*
+	 * Check if we should continue with shared iolock
+	 */
+	if (iolock == EXT4_IOLOCK_SHARED && !ext4_dio_should_shared_lock(inode))
 		iolock = EXT4_IOLOCK_EXCL;
 
 	if (iocb->ki_flags & IOCB_NOWAIT) {
-- 
2.21.0
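
For readers who want to reason about the lock choice outside the kernel, the decision made by this patch can be approximated with a small userspace model. This is only a sketch: every type and function name below (model_inode, model_write, model_should_shared_lock, model_pick_iolock, model_example) is hypothetical and does not exist in fs/ext4, and the model flattens the two-stage check in ext4_dio_write_iter() and ext4_dio_write_checks() into a single function. It assumes the only inputs that matter are the conditions named in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum model_iolock { MODEL_IOLOCK_SHARED, MODEL_IOLOCK_EXCL };

struct model_inode {
	bool is_regular;    /* stands in for S_ISREG(inode->i_mode) */
	bool uses_extents;  /* stands in for the EXT4_INODE_EXTENTS flag */
	bool journals_data; /* stands in for ext4_should_journal_data() */
	bool nosec;         /* stands in for IS_NOSEC(inode) */
};

struct model_write {
	bool unaligned; /* write is not block aligned */
	bool extends;   /* write goes past the current file size */
	bool overwrite; /* range is already allocated and initialized */
};

/* Mirrors the three checks of ext4_dio_should_shared_lock() in the diff. */
static bool model_should_shared_lock(const struct model_inode *inode)
{
	if (!inode->is_regular)
		return false;
	if (!inode->uses_extents)
		return false;
	if (inode->journals_data)
		return false;
	return true;
}

/* Start from the shared lock; any blocking condition forces exclusive. */
static enum model_iolock model_pick_iolock(const struct model_inode *inode,
					   const struct model_write *wr)
{
	if (!inode->nosec || wr->unaligned || wr->extends ||
	    !model_should_shared_lock(inode) || !wr->overwrite)
		return MODEL_IOLOCK_EXCL;
	return MODEL_IOLOCK_SHARED;
}

/* Usage example: an aligned overwrite keeps the shared lock. */
void model_example(void)
{
	struct model_inode inode = { true, true, false, true };
	struct model_write wr = { false, false, true };

	assert(model_pick_iolock(&inode, &wr) == MODEL_IOLOCK_SHARED);

	/* An extending write needs the exclusive lock. */
	wr.extends = true;
	assert(model_pick_iolock(&inode, &wr) == MODEL_IOLOCK_EXCL);

	/* So does an inode in data journalling mode. */
	wr.extends = false;
	inode.journals_data = true;
	assert(model_pick_iolock(&inode, &wr) == MODEL_IOLOCK_EXCL);
}
```

Note that the model has no input for the dioread_nolock mount option, which is the point of the patch: the shared lock now depends only on per-inode and per-write conditions.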