2011-03-03 19:29:57

by Mingming Cao

Subject: [PATCH] ext4: Use single thread to perform DIO unwritten conversion

While running ext4 tests on a multi-core machine, we found per-CPU
ext4-dio-unwritten threads processing the conversion of unwritten
extents to written for I/Os completed by async direct I/O.
One thread per filesystem is enough; we don't need per-CPU threads to
do the conversion.

Signed-off-by: Mingming Cao <[email protected]>
---
fs/ext4/super.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index f6a318f..c76a6a5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3509,7 +3509,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 	percpu_counter_set(&sbi->s_dirtyblocks_counter, 0);

 no_journal:
-	EXT4_SB(sb)->dio_unwritten_wq = create_workqueue("ext4-dio-unwritten");
+	EXT4_SB(sb)->dio_unwritten_wq = create_singlethread_workqueue("ext4-dio-unwritten");
 	if (!EXT4_SB(sb)->dio_unwritten_wq) {
 		printk(KERN_ERR "EXT4-fs: failed to create DIO workqueue\n");
 		goto failed_mount_wq;
--
1.6.3.3





2011-03-06 00:13:32

by Theodore Ts'o

Subject: Re: [PATCH] ext4: Use single thread to perform DIO unwritten conversion

On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> While running ext4 testing on multiple core, we found there are per cpu ext4-dio-unwritten threads processing
> conversion from unwritten extents to written for IOs completed from async direct IO patch.
> Per filesystem is enough, we don't need per cpu threads to work on conversion.
>
> Signed-off-by: Mingming Cao <[email protected]>

Thanks, added to the ext4 patch queue.

- Ted

2011-03-06 00:13:30

by Theodore Ts'o

Subject: Re: [PATCH] ext4: Use single thread to perform DIO unwritten conversion

On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> While running ext4 testing on multiple core, we found there are per
> cpu ext4-dio-unwritten threads processing conversion from unwritten
> extents to written for IOs completed from async direct IO patch.
> Per filesystem is enough, we don't need per cpu threads to work on
> conversion.
>
> Signed-off-by: Mingming Cao <[email protected]>

Eric, would you be able to do a very quick sanity check on your
48-core machine? I can definitely see how having a huge number of
threads per file system could be problematic, especially on a system
with 32 or 64 ext4 file systems. I'm curious though if we'll end up
taking a performance hit on direct I/O workloads.

If I remember correctly we currently have large file create with DIO
turned off, right? Would it be possible to do a large file create
with DIO enabled, and do a quick run both with and without this patch?

In the future it would also be interesting to see how we are doing
versus other file systems using a DIO workload. This is probably
another area where I suspect some lockstat and oprofile runs may give
us opportunities for further optimization.

- Ted

2011-03-07 15:47:22

by Eric Whitney

Subject: Re: [PATCH] ext4: Use single thread to perform DIO unwritten conversion



On 03/05/2011 12:46 PM, Ted Ts'o wrote:
> On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
>> While running ext4 testing on multiple core, we found there are per
>> cpu ext4-dio-unwritten threads processing conversion from unwritten
>> extents to written for IOs completed from async direct IO patch.
>> Per filesystem is enough, we don't need per cpu threads to work on
>> conversion.
>>
>> Signed-off-by: Mingming Cao<[email protected]>
>
> Eric, would you be able to do a very quick sanity check on your
> 48-core machine? I can definitely see how having a huge number of
> threads per file system could be problematic, especially on a system
> with 32 or 64 ext4 file systems. I'm curious though if we'll end up
> taking a performance hit on direct I/O workloads.
>

Hi Ted:

Sure, I can do that - I'll queue it up once I'm done with the "for .39"
patch measurements.

> If I remember correctly we currently have large file create with DIO
> turned off, right? Would it be possible to do a large file create
> with DIO enabled, and do a quick run both with and without this patch?

That's right, we're not measuring DIO right now. I think I've got
enough hardware to run a filesystem per core (or more), and I think it
should be straightforward to write a modified ffsb profile to run (say)
48 filesystems in parallel.

>
> In the future it would also be interesting to see how we are doing
> versus other file systems using a DIO workload. This is a probably
> another area where I suspect some lockstat and oprofile runs may give
> us opportunities for further optimization.

Yes - as discussed at Plumbers. I'll put that on the list as well.
With luck, there should be some time towards the end of the .39 merge
window.

Eric

>
> - Ted

2011-03-08 01:40:55

by Mingming Cao

Subject: Re: [PATCH] ext4: Use single thread to perform DIO unwritten conversion

On Sat, 2011-03-05 at 12:46 -0500, Ted Ts'o wrote:
> On Thu, Mar 03, 2011 at 11:29:54AM -0800, Mingming Cao wrote:
> > While running ext4 testing on multiple core, we found there are per
> > cpu ext4-dio-unwritten threads processing conversion from unwritten
> > extents to written for IOs completed from async direct IO patch.
> > Per filesystem is enough, we don't need per cpu threads to work on
> > conversion.
> >
> > Signed-off-by: Mingming Cao <[email protected]>
>
> Eric, would you be able to do a very quick sanity check on your
> 48-core machine? I can definitely see how having a huge number of
> threads per file system could be problematic, especially on a system
> with 32 or 64 ext4 file systems. I'm curious though if we'll end up
> taking a performance hit on direct I/O workloads.
>
> If I remember correctly we currently have large file create with DIO
> turned off, right? Would it be possible to do a large file create
> with DIO enabled, and do a quick run both with and without this patch?
>
The background thread performs the conversion when async DIO writes to
holes or preallocated space complete. So we would need to set up
fallocated files and run async direct I/O over them to exercise any
potential scalability issue with the background DIO conversion
thread...

I took a look at FFSB; it doesn't support fallocate or async I/O yet.
But fio does support aio and fallocate. Here is a simple fio profile I
use: the test file is set up by fallocate(), and random aio DIO runs
over it. It may be useful for Eric to try, or to use as a reference, on
his 48-core machine.

examples$ cat aio-setup
; Random read/write to fallocated files with aio dio
[global]
ioengine=libaio
direct=1
rw=randrw
bs=4k
size=2m
filesize=1024m
fallocate=1
directory=/tmp

[file1]
iodepth=4

> In the future it would also be interesting to see how we are doing
> versus other file systems using a DIO workload. This is a probably
> another area where I suspect some lockstat and oprofile runs may give
> us opportunities for further optimization.
>
> - Ted