2014-06-04 19:03:21

by Iozone

Subject: FW: Forwarding request at suggestion from support



From: Iozone [mailto:[email protected]]
Sent: Wednesday, June 04, 2014 11:39 AM
To: [email protected]
Subject: Forwarding request at suggestion from support

Dear kernel folks,

Please take a look at Bugzilla bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1104696

Description of problem:

   Linux NFSv3 clients can issue extra reads beyond EOF.

Condition of the test: (32KB_file is a file that is 32KB in size)
    File is being read over an NFSv3 mount.

    dd if=/mnt/32KB_file of=/dev/null iflag=direct bs=1M count=1

What one should expect over the wire:
    NFSv3_read for 32k, or NFS_read for 1M
    NFSv3_read Reply return of 32KB and EOF set.
?
What happens with Linux NFSv3 client:
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k,
    NFSv3 read for 128k.
  followed by:
    NFSv3 read reply of 32k,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0,
    NFSv3 read reply of 0.
?
So... instead of a single round trip with a short read length returned, there
were 8 async I/O ops sent to the NFS server, and 8 replies from the NFS
server. The client knew the file size before even sending the very first
request, but went ahead and issued a large number of reads that it should
have known were beyond EOF.

This client behavior hammers NFS servers with requests that are guaranteed
to always fail, and burns CPU cycles on operations that it knew were
pointless.

While the application is getting correct answers to the API calls, the poor
client and server are beating each other senseless over the wire.
?
NOTE: This only happens if O_DIRECT is being used... (thus the iflag=direct)

Version-Release number of selected component (if applicable):
   Every version of Linux that I have tested seems to do this...

How reproducible:
   Extremely. Though the actual transfer size being sent by the Linux
   client is sometimes a bunch of 32k transfers, or 128k or 256k, depending
   on the version of the kernel, and the NFS mount block size.

   A worse case is when an app does a 1MB read (with O_DIRECT) with the NFS
   block size set to 32K: of the 32 async reads, 1 is useful and 31 are
   beyond EOF. So for each 32k of file data, there are 62 extra NFS
   messages between the client and the server, and only 2 messages that
   made sense or ever should have taken place. (IMHO)
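
The arithmetic above can be checked with a small sketch. It assumes the behavior reported in this message: one O_DIRECT read() is split into block-size READs and all of them are sent regardless of file size. The function name and the returned keys are made up for illustration and are not taken from the kernel.

```python
import math

KIB = 1024
MIB = 1024 * KIB


def direct_read_rpcs(request_bytes, file_bytes, rsize_bytes):
    """Count READ RPCs for one O_DIRECT read() under the reported behavior.

    Assumption: the client sends ceil(request / rsize) READs no matter how
    small the file is. This models the trace, not the kernel source.
    """
    issued = math.ceil(request_bytes / rsize_bytes)
    # READs that can return data. At least one READ is always needed,
    # because it is the reply that tells the client where EOF is.
    useful = max(1, math.ceil(min(request_bytes, file_bytes) / rsize_bytes))
    beyond_eof = issued - useful
    return {
        "issued": issued,
        "useful": useful,
        "beyond_eof": beyond_eof,
        # one call plus one reply for every READ that was beyond EOF
        "extra_messages": 2 * beyond_eof,
    }


# 1MB request, 32k file, 32k block size: the worse case described above.
print(direct_read_rpcs(1 * MIB, 32 * KIB, 32 * KIB))
# 1MB request, 32k file, 128k transfers: the dd trace at the top.
print(direct_read_rpcs(1 * MIB, 32 * KIB, 128 * KIB))
```

The first call gives 32 READs issued, 31 beyond EOF and 62 extra messages; the second gives the 8 READs seen in the dd trace.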
?
Steps to Reproduce:
   The steps are above.

In the attachment, the file size is 32k, the NFS block size is 32k. So you
can see all of the extra async (back to back) client requests that are
all going to return zero, except the very first one.
?
----------------------------------------------------------------------------

More details of why this is an issue:

In the SPECsfs2014 benchmark (under development) there is a workload that
simulates a software build environment. There are billions of small files.
One of the operations that is tested is to read an entire file. This
operation uses a large transfer size so that it can read the files as
efficiently as possible. Whenever the file size is smaller than this large
transfer size, the NFS client issues 8 to 64 times as many I/Os as were
necessary to read the small file.

If you take into consideration that this benchmark is simulating hundreds,
or thousands of users, with billions of files, and is using multiple client
nodes to present load to the server under test... This overshoot on the
reads is burying the NFS server with work that should never have been sent
to the server. Instead of the benchmark measuring how fast the NFS server
can serve files, it becomes a test of how many insane requests beyond EOF
can the server tolerate from a Linux NFSv3 client, while serving almost no
file data at all.

I don't understand why the Linux NFS client would ever issue reads beyond
EOF. These files were opened, (LOOKUP, GETATTR, ACCESS) so the Linux kernel
knows the file size. It should be a one line change to the code to simply
*not* issue async read-aheads for file data that is beyond EOF.

AND....

The Linux NFS client code certainly appears to know how to not read beyond
EOF when the O_DIRECT flag is off. Why is this not the same when the
O_DIRECT flag is on ??

Thank you,
Don Capps
Capps at iozone dot org

P.S. This overshoot is also confirmed to be happening in the NFSv4 client
code.




2014-06-04 20:42:09

by Trond Myklebust

Subject: Re: FW: Forwarding request at suggestion from support

Hi Don,

On Wed, Jun 4, 2014 at 2:02 PM, Iozone <[email protected]> wrote:
>
>
> From: Iozone [mailto:[email protected]]
> Sent: Wednesday, June 04, 2014 11:39 AM
> To: [email protected]
> Subject: Forwarding request at suggestion from support
>
> Dear kernel folks,
>
> Please take a look at Bugzilla bug:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1104696
>
> Description of problem:
>
> Linux NFSv3 clients can issue extra reads beyond EOF.
>
> Condition of the test: (32KB_file is a file that is 32KB in size)
> File is being read over an NFSv3 mount.
>
> dd if=/mnt/32KB_file of=/dev/null iflag=direct bs=1M count=1
>
> What one should expect over the wire:
> NFSv3_read for 32k, or NFS_read for 1M
> NFSv3_read Reply return of 32KB and EOF set.
>
> What happens with Linux NFSv3 client:
> NFSv3 read for 128k
> NFSv3 read for 128k,
> NFSv3 read for 128k,
> NFSv3 read for 128k,
> NFSv3 read for 128k,
> NFSv3 read for 128k,
> NFSv3 read for 128k,
> NFSv3 read for 128k.
> followed by:
> NFSv3 read reply of 32k,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0,
> NFSv3 read reply of 0.
>
> So… instead of a single round trip with a short read length returned, there
> were 8 async I/O ops sent to the NFS server, and 8 replies from the NFS
> server.
> The client knew the file size before even sending the very first request,
> but
> went ahead and issued an large number of reads that it should have known
> were
> beyond EOF.
>
> This client behavior hammers NFS servers with requests that are guaranteed
> to always fail, and burn
> CPU cycles, for operations that it knew were pointless.
>
> While the application is getting correct answers to the API calls, the poor
> client and server are beating each other senseless over the wire.
>
> NOTE: This only happens if O_DIRECT is being used… (thus the iflag=direct)

Yes. This behaviour is intentional in the case of O_DIRECT. The reason
why we should not change it is that we don't ever want to rely on
cached values for the file size when doing uncached I/O.
An application such as Oracle may have out-of-band information about
writes to the file that were made by another client directly to the
server, in which case it would be wrong for the kernel to truncate
those reads based on its cached information.
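
To make the stale-size case concrete, here is a toy model of it. The ToyServer class and both read helpers are hypothetical stand-ins invented for this sketch; they are not NFS client code, and they only illustrate why clamping an uncached read to a cached size can return too little.

```python
class ToyServer:
    """Hypothetical stand-in for an NFS server holding a single file."""

    def __init__(self, data):
        self.data = bytearray(data)

    def read(self, offset, count):
        # Return the bytes that exist in [offset, offset + count) and
        # whether the request reached the end of the file.
        chunk = bytes(self.data[offset:offset + count])
        eof = offset + count >= len(self.data)
        return chunk, eof


def read_clamped(server, cached_size, offset, count):
    """Truncate the request to the client's cached file size first."""
    count = max(0, min(count, cached_size - offset))
    data, _eof = server.read(offset, count)
    return data


def read_unclamped(server, offset, count):
    """Ask the server for the full range and let it report EOF."""
    data, _eof = server.read(offset, count)
    return data


server = ToyServer(b"a" * 4096)
cached_size = len(server.data)   # this client last saw a 4096-byte file
server.data += b"b" * 4096       # another client appends on the server

stale = read_clamped(server, cached_size, 0, 8192)
fresh = read_unclamped(server, 0, 8192)
print(len(stale), len(fresh))    # 4096 8192
```

The clamped read misses the 4096 bytes the other client wrote, which is the kind of truncation described above.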

Cheers
Trond

--
Trond Myklebust

Linux NFS client maintainer, PrimaryData

[email protected]

2014-06-04 21:14:37

by Trond Myklebust

Subject: Re: FW: Forwarding request at suggestion from support

On Wed, Jun 4, 2014 at 5:03 PM, Iozone <[email protected]> wrote:
> Trond,
>
> Ok... but as the replies are coming back, all but one with EOF and zero bytes
> transferred, does it still make sense to keep issuing reads that are beyond EOF ?

It depends. The reads should all be sent asynchronously, so it isn't
clear to me that the client will see the EOF until all the RPC
requests are in flight.

That said, it is true that we do not have any machinery right now to
stop further submissions if we see that we have already collected
enough information to complete the read() syscall. Are there any good
use cases for O_DIRECT that justify adding such machinery? Oracle
doesn't seem to need it.
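
For discussion, such machinery could look roughly like the sketch below. It is a toy model only: the window parameter and the count_reads function are hypothetical, nothing like this exists in the client today, and the EOF test is a simplification of what a READ reply reports.

```python
def count_reads(request_bytes, file_bytes, rsize, window=None):
    """Count the READs sent for one read() call against a toy server.

    window=None models the current behavior: every rsize-sized READ is
    submitted. window=N is a hypothetical policy: submit at most N READs,
    wait for their replies, and stop once any reply reaches EOF.
    """
    offsets = list(range(0, request_bytes, rsize))
    if not offsets:
        return 0
    if window is None:
        window = len(offsets)
    sent = 0
    for start in range(0, len(offsets), window):
        batch = offsets[start:start + window]
        sent += len(batch)
        # Toy reply: a READ reaches EOF when its range ends at or past
        # the end of the file.
        if any(off + rsize >= file_bytes for off in batch):
            break
    return sent


KIB = 1024
MIB = 1024 * KIB

# 16MB request against a 4k file with 32k READs.
print(count_reads(16 * MIB, 4 * KIB, 32 * KIB))            # 512
print(count_reads(16 * MIB, 4 * KIB, 32 * KIB, window=8))  # 8
```

The cost is that a small window also limits how many READs are in flight for large files, so the pipelining that O_DIRECT users rely on would have to be weighed against the saved RPCs.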

Cheers
Trond


2014-06-04 21:56:34

by Trond Myklebust

Subject: Re: FW: Forwarding request at suggestion from support

On Wed, Jun 4, 2014 at 5:36 PM, Iozone <[email protected]> wrote:
> Trond,
>
> I have traces where there are indeed a bunch of async reads issued, and
> the replies come back. One with data, and all of the rest with zero bytes
> transferred, indicating EOF. This was followed by a bunch more async
> reads, all of which come back with zero bytes transferred. It appears
> that if the user requested 16MB, and the file was 4k, then there will
> be 16MB of transfers issued regardless of the fact that all but one
> are returning zero bytes....
>
>
> Business case:
> Not only could this impact benchmarks... but it also has the potential
> of opening a door for a DoS-type attack on an NFS server. All it would take
> is one small file, and a bunch of clients going after 1GB reads on that file
> with O_DIRECT, and the poor NFS server is going to get slammed with
> requests at a phenomenal rate (as the client is issuing these back-to-back
> async, and the server is responding with back-to-back zero length
> transfer replies). The client burns very little CPU, and the NFS server
> is buried, doing zero length transfers... pretty much in a very tight loop....

Sorry, but no, that's not convincing. There are plenty of things you
can do using NFS to force the server to do unnecessary work. Fire up
1000 threads and your one client can slam it with stat() calls,
open(), readdir, or anything else you care to name. The server can and
should throttle the TCP connection if it wants to push back on a
particular client to slow down the rate.

As I indicated earlier, the main question here is what is the value of
this functionality to specific applications that need to use O_DIRECT.



2014-06-04 22:04:02

by Iozone

Subject: RE: FW: Forwarding request at suggestion from support

Trond,

Ok... but as the replies are coming back, all but one with EOF and zero bytes
transferred, does it still make sense to keep issuing reads that are beyond EOF ?

Enjoy,
Don Capps



2014-06-04 21:36:15

by Iozone

Subject: RE: FW: Forwarding request at suggestion from support

Trond,

I have traces where there are indeed a bunch of async reads issued, and
the replies come back. One with data, and all of the rest with zero bytes
transferred, indicating EOF. This was followed by a bunch more async
reads, all of which come back with zero bytes transferred. It appears
that if the user requested 16MB, and the file was 4k, then there will
be 16MB of transfers issued regardless of the fact that all but one
are returning zero bytes....

Business case:
Not only could this impact benchmarks... but it also has the potential
of opening a door for a DoS-type attack on an NFS server. All it would take
is one small file, and a bunch of clients going after 1GB reads on that file
with O_DIRECT, and the poor NFS server is going to get slammed with
requests at a phenomenal rate (as the client is issuing these back-to-back
async, and the server is responding with back-to-back zero length
transfer replies). The client burns very little CPU, and the NFS server
is buried, doing zero length transfers... pretty much in a very tight loop....

Thank you,
Don Capps

