2011-01-04 08:53:41

by Daniel Stodden

Subject: Cache flush question.


Hi anyone.

If somebody's got a sec to enlighten me, there's some phenomenon I
recently came across and found somewhat counterintuitive first.

Whenever I

1. Dirty a bunch of pages backed by an NFS mount on some server.

2. Block the traffic with iptables (TCP, assuming that mattered).
Still plenty of writeback pending.

3. Sync

I see #3 drive the dirty count in /proc/meminfo back down to
almost-zero, immediately. The sync itself blocks, though.

So the pages are called clean the moment the write got queued, not
acked? Leaving the rest just to retransmits by the socket then? Is this
just done so because one can, or would that order rather matter for
consistency?
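For concreteness, the steps above as a rough shell sketch (the mount
point /mnt/nfs, the server address 192.0.2.10, and the sizes are made-up
placeholders; the iptables step needs root):

```shell
# 1. Dirty a bunch of pages backed by the NFS mount (buffered writes
#    only, no O_SYNC, so the data just sits in the page cache as Dirty).
dd if=/dev/zero of=/mnt/nfs/junk bs=1M count=256

# 2. Block TCP traffic to the server; plenty of writeback now pending.
iptables -A OUTPUT -d 192.0.2.10 -p tcp -j DROP

# 3. Sync in the background and watch Dirty: while sync itself blocks.
sync &
grep -E '^(Dirty|Writeback):' /proc/meminfo
```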

Thanks,
Daniel



2011-01-05 23:26:42

by Daniel Stodden

Subject: Re: Cache flush question.

On Tue, 2011-01-04 at 09:32 -0500, Trond Myklebust wrote:
> On Tue, 2011-01-04 at 00:44 -0800, Daniel Stodden wrote:
> > Hi anyone.
> >
> > If somebody's got a sec to enlighten me, there's some phenomenon I
> > recently came across and found somewhat counterintuitive first.
> >
> > Whenever I
> >
> > 1. Dirty a bunch of pages backed by an NFS mount on some server.
> >
> > 2. Block the traffic with iptables (TCP, assuming that mattered).
> > Still plenty of writeback pending.
> >
> > 3. Sync
> >
> > I see #3 drive the dirty count in /proc/meminfo drop back to
> > almost-zero, immediately. The sync itself blocks, though.
> >
> > So the pages are called clean the moment the write got queued, not
> > acked? Leaving the rest just to retransmits by the socket then? Is this
> > just done so because one can, or would that order rather matter for
> > consistency?
>
> Take a look at the 'Writeback:' count, which should turn non-zero when
> you hit #3.
>
> The VM allows pages to be either dirty or in writeback, but not both at
> the same time. This is not NFS-specific. The same rule applies to local
> filesystems.

Ah. That explains everything. Actually, one question then. Thanks for the
clarification :)

Rob Landley's comment regarding tx queue size made a good point too.
But, given the rates I see, this mostly queues cache pages on the
transport, not copies, right?

Thanks.

Daniel


2011-01-04 14:32:39

by Myklebust, Trond

Subject: Re: Cache flush question.

On Tue, 2011-01-04 at 00:44 -0800, Daniel Stodden wrote:
> Hi anyone.
>
> If somebody's got a sec to enlighten me, there's some phenomenon I
> recently came across and found somewhat counterintuitive first.
>
> Whenever I
>
> 1. Dirty a bunch of pages backed by an NFS mount on some server.
>
> 2. Block the traffic with iptables (TCP, assuming that mattered).
> Still plenty of writeback pending.
>
> 3. Sync
>
> I see #3 drive the dirty count in /proc/meminfo drop back to
> almost-zero, immediately. The sync itself blocks, though.
>
> So the pages are called clean the moment the write got queued, not
> acked? Leaving the rest just to retransmits by the socket then? Is this
> just done so because one can, or would that order rather matter for
> consistency?

Take a look at the 'Writeback:' count, which should turn non-zero when
you hit #3.

The VM allows pages to be either dirty or in writeback, but not both at
the same time. This is not NFS-specific. The same rule applies to local
filesystems.
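A quick way to watch that handoff (just a sketch; it samples the two
counters three times, once a second):

```shell
# Sample Dirty: and Writeback: from /proc/meminfo. During a sync against
# a stalled server, Dirty: collapses while Writeback: jumps, since a
# page is counted as one or the other but never both.
for i in 1 2 3; do
    awk '/^(Dirty|Writeback):/ { printf "%s %s kB  ", $1, $2 } END { print "" }' /proc/meminfo
    sleep 1
done
```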

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2011-01-04 14:00:16

by Rob Landley

Subject: Re: Cache flush question.

On 01/04/2011 02:44 AM, Daniel Stodden wrote:
>
> Hi anyone.
>
> If somebody's got a sec to enlighten me, there's some phenomenon I
> recently came across and found somewhat counterintuitive first.
>
> Whenever I
>
> 1. Dirty a bunch of pages backed by an NFS mount on some server.
>
> 2. Block the traffic with iptables (TCP, assuming that mattered).
> Still plenty of writeback pending.
>
> 3. Sync
>
> I see #3 drive the dirty count in /proc/meminfo drop back to
> almost-zero, immediately. The sync itself blocks, though.
>
> So the pages are called clean the moment the write got queued, not
> acked? Leaving the rest just to retransmits by the socket then? Is this
> just done so because one can, or would that order rather matter for
> consistency?

At a wild guess, maybe you're experiencing what Jim Gettys dubbed
"buffer bloat".

http://lwn.net/Articles/419714/

Specifically, does ifconfig show a txqueuelen of 1000 for your device?
That means the device is buffering 1000 outbound packets, for no readily
apparent reason (other than to screw up latency). With an MTU of 1500
that's a megabyte and a half of outgoing data constipated in the network
layer.
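To check and do the arithmetic (the interface name eth0 is an
assumption; substitute your own):

```shell
# Show the transmit queue length for the device; "qlen" in ip(8)
# output is the same value ifconfig reports as txqueuelen.
ip link show eth0 2>/dev/null | grep -o 'qlen [0-9]*' || echo "qlen unknown"

# Worst case parked in that queue: txqueuelen * MTU.
echo $((1000 * 1500)) bytes
```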

NFS also has some caching of its own that I don't understand yet, for
"non-idempotent" transactions. Described in this OLS paper:

http://kernel.org/doc/ols/2006/ols2006v2-pages-59-72.pdf

Rob

2011-01-06 07:33:44

by Rob Landley

Subject: Re: Cache flush question.

On 01/05/2011 05:15 PM, Daniel Stodden wrote:
> On Tue, 2011-01-04 at 09:32 -0500, Trond Myklebust wrote:
>> On Tue, 2011-01-04 at 00:44 -0800, Daniel Stodden wrote:
>>> Hi anyone.
>>>
>>> If somebody's got a sec to enlighten me, there's some phenomenon I
>>> recently came across and found somewhat counterintuitive first.
>>>
>>> Whenever I
>>>
>>> 1. Dirty a bunch of pages backed by an NFS mount on some server.
>>>
>>> 2. Block the traffic with iptables (TCP, assuming that mattered).
>>> Still plenty of writeback pending.
>>>
>>> 3. Sync
>>>
>>> I see #3 drive the dirty count in /proc/meminfo drop back to
>>> almost-zero, immediately. The sync itself blocks, though.
>>>
>>> So the pages are called clean the moment the write got queued, not
>>> acked? Leaving the rest just to retransmits by the socket then? Is this
>>> just done so because one can, or would that order rather matter for
>>> consistency?
>>
>> Take a look at the 'Writeback:' count, which should turn non-zero when
>> you hit #3.
>>
>> The VM allows pages to be either dirty or in writeback, but not both at
>> the same time. This is not NFS-specific. The same rule applies to local
>> filesystems.
>
> Ah. That explains everything. Actually, one question then. Thanks for the
> clarification :)
>
> Rob Landley's comment regarding tx queue size made a good point too.

Not nearly as good as I thought at the time.

> But, given the rates I see, this mostly queues cache pages on the
> transport, not copies, right?

Easy enough to tell: there's a "Writeback" field in /proc/meminfo. Add
'em up and see what's missing.
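Something along these lines (NFS_Unstable counts pages the server has
received but not yet committed to stable storage; field names as found
in 2011-era kernels):

```shell
# Sum everything the client still owes the server: dirty pages,
# pages in flight, and pages written but not yet committed (unstable).
awk '/^(Dirty|Writeback|NFS_Unstable):/ { sum += $2; print }
     END { print "total:", sum, "kB" }' /proc/meminfo
```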

Rob