From: Christoph Hellwig
Subject: Re: [PATCH 1/3] xfs: honor the O_SYNC flag for asynchronous direct I/O requests
Date: Sat, 28 Jan 2012 09:59:33 -0500
Message-ID: <20120128145933.GA10931@infradead.org>
In-Reply-To: <1327698949-12616-2-git-send-email-jmoyer@redhat.com>
References: <1327698949-12616-1-git-send-email-jmoyer@redhat.com> <1327698949-12616-2-git-send-email-jmoyer@redhat.com>
To: Jeff Moyer
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com

This looks pretty good.  Did this pass xfstests?

I'd also like to add tests that actually execute this code path, just to be sure, e.g. variants of aio-stress that actually use O_SYNC.  We can't easily verify that the data really made it to disk that way, but at least we'd make sure the code doesn't break.

On Fri, Jan 27, 2012 at 04:15:47PM -0500, Jeff Moyer wrote:
> Hi,
>
> If a file is opened with O_SYNC|O_DIRECT, the drive cache does not get
> flushed after the write completion.  Instead, it's flushed *before* the
> I/O is sent to the disk (in __generic_file_aio_write).

XFS doesn't actually use __generic_file_aio_write, so this sentence isn't correct for XFS.

> +	} else if (xfs_ioend_needs_cache_flush(ioend)) {
> +		struct xfs_inode *ip = XFS_I(ioend->io_inode);
> +		struct xfs_mount *mp = ip->i_mount;
> +		int err;
> +		int log_flushed = 0;
> +
> +		/*
> +		 * Check to see if we only need to sync data.  If so,
> +		 * we can skip the log flush.
> +		 */
> +		if (IS_SYNC(ioend->io_inode) ||
> +		    (ioend->io_iocb->ki_filp->f_flags & __O_SYNC)) {
> +			err = _xfs_log_force(mp, XFS_LOG_SYNC, &log_flushed);

Can you add a TODO comment that this actually is synchronous and thus will block the I/O completion work queue?
Also, you can use _xfs_log_force_lsn here, as we don't need to flush the whole log, just up to the last lsn that touched the inode.  Copy, or better, factor out the code from xfs_dir_fsync for that.

Last but not least, this won't catch timestamp updates.  Given that I'm about to send a series making timestamp updates transactional, I would not recommend bothering with that, but if you want to, take a look at how xfs_file_fsync deals with them.

Given that this series touches the same area, I'd also like to take your xfs patch in through the xfs tree to avoid conflicts.

> @@ -47,6 +47,7 @@ STATIC int xfsbufd(void *);
>  static struct workqueue_struct *xfslogd_workqueue;
>  struct workqueue_struct *xfsdatad_workqueue;
>  struct workqueue_struct *xfsconvertd_workqueue;
> +struct workqueue_struct *xfsflushd_workqueue;
>
>  #ifdef XFS_BUF_LOCK_TRACKING
>  # define XB_SET_OWNER(bp)	((bp)->b_last_holder = current->pid)

> @@ -1802,8 +1803,15 @@ xfs_buf_init(void)
>  	if (!xfsconvertd_workqueue)
>  		goto out_destroy_xfsdatad_workqueue;
>
> +	xfsflushd_workqueue = alloc_workqueue("xfsflushd",
> +					      WQ_MEM_RECLAIM, 1);

This should allow a higher concurrency level; it's probably a good idea to pass 0 for max_active and use the default.
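[Editor's note: the xfs_dir_fsync pattern suggested above, factored for reuse at I/O completion, might look roughly like this. This is a sketch against the 2012-era tree, not a tested patch; the ili_last_lsn field and the locking are copied from xfs_dir_fsync of that period and should be re-checked against the actual source.]

```c
/*
 * Sketch only: flush the log just up to the last LSN that touched
 * this inode, rather than forcing the whole log.
 */
xfs_lsn_t	lsn = 0;

xfs_ilock(ip, XFS_ILOCK_SHARED);
if (xfs_ipincount(ip))
	lsn = ip->i_itemp->ili_last_lsn;
xfs_iunlock(ip, XFS_ILOCK_SHARED);

if (lsn)
	err = _xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
```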