2012-05-03 15:34:07

by Niels de Vos

[permalink] [raw]
Subject: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

When an application on an NFS-client (tested with NFSv3) executes the
following steps, data written after the close() is never flushed to the
server:

1. open()
2. mmap()
3. close()
4. <modify data in the mmap'ed area>
5. munmap()
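
The five steps above can be sketched as a small user-space program. This is a minimal sketch, not the test program used for this report: the helper name write_after_close() and its arguments are assumptions made for illustration, and the file is sized with ftruncate() so that the mapping has backing store.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: runs steps 1-5 on `path`, storing `msg` at offset 0.
 * Returns 0 on success and -1 on any failure. */
int write_after_close(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    if (len == 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0644);        /* 1. open() */
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {                /* size the file for the mapping */
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);                /* 2. mmap() */
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    if (close(fd) < 0) {                                /* 3. close() */
        munmap(map, len);
        return -1;
    }

    memcpy(map, msg, len);                              /* 4. modify the mapped area */

    return munmap(map, len);                            /* 5. munmap(), no msync() */
}
```

Running the helper against a file on an NFS mount and then reading that file on the server would be one way to check whether the modification arrived.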

Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
result in the data being sent to the server.

The man-page for mmap (man 2 mmap) does mention that closing the file
descriptor does not munmap() the area. Using the mmap'ed area after a
close() sounds valid to me (even if it may be bad practice).

Investigation and checking showed that the NFS-client does not handle
munmap(), and only flushes on close(). To solve this problem, at least two
solutions can be proposed:

a. f_ops->release() is called on munmap() as well as on close(),
    therefore release() can be used to flush data as well.
b. Add a .close operation to the 'struct vm_operations_struct' that
    nfs_file_mmap() installs on the 'struct vm_area_struct', and flush
    the data in the new close() function.

Solution a. currently requires very few code changes:

--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)

 int nfs_release(struct inode *inode, struct file *filp)
 {
+       if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0)
+               nfs_sync_mapping(inode->i_mapping);
        nfs_file_clear_open_context(filp);
        return 0;
 }

The disadvantage is that nfs_release() is called on close() too. That
means the dirty pages get flushed, and right after that
nfs_file_clear_open_context() might flush again. The advantage is that
it is possible (though not done at the moment) to return an error in
case flushing fails.

Solution b. does not provide an option to return an error, but does not
get called on each close():

--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -547,9 +547,17 @@ out:
 	return ret;
 }

+static void nfs_vm_close(struct vm_area_struct *vma)
+{
+	struct file *filp = vma->vm_file;
+
+	nfs_file_flush(filp, (fl_owner_t)filp);
+}
+
 static const struct vm_operations_struct nfs_file_vm_ops = {
 	.fault = filemap_fault,
 	.page_mkwrite = nfs_vm_page_mkwrite,
+	.close = nfs_vm_close,
 };

 static int nfs_need_sync_write(struct file *filp, struct inode *inode)

I would like some feedback on which solution is most acceptable, or any
other suggestions.

Many thanks,
Niels


2012-05-03 15:43:15

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
> When an application on an NFS-client (tested with NFSv3) executes the
> following steps, data written after the close() is never flushed to the
> server:
>
> 1. open()
> 2. mmap()
> 3. close()
> 4. <modify data in the mmap'ed area>
> 5. munmap()
>
> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
> result in the data being sent to the server.
>
> The man-page for mmap (man 2 mmap) does mention that closing the file-
> descriptor does not munmap() the area. Using the mmap'ed area after a
> close() sound valid to me (even if it may be bad practice).
>
> Investigation and checking showed that the NFS-client does not handle
> munmap(), and only flushes on close(). To solve this problem, least two
> solutions can be proposed:
>
> a. f_ops->release() is called on munmap() as well as on close(),
>     therefore release() can be used to flush data as well.
> b. In the 'struct vm_operations_struct' add a .close to the
>     'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
>     the data in the new close() function.
>
> Solution a. contains currently very few code changes:
>
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
>
>   int nfs_release(struct inode *inode, struct file *filp)
>   {
> +       if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
> +               nfs_sync_mapping(inode->i_mapping);
>          nfs_file_clear_open_context(filp);
>          return 0;
>   }
>
> The disadvantage is, that nfs_release() is called on close() too. That
> means this causes a flushing of dirty pages, and just after that the
> nfs_file_clear_open_context() might flush again. The advantage is that
> it is possible (though not done at the moment) to return an error in
> case flushing failed.
>
> Solution b. does not provide an option to return an error, but does not
> get called on each close():
>
> --- a/fs/nfs/file.c
> +++ b/fs/nfs/file.c
> @@ -547,9 +547,17 @@ out:
>   	return ret;
>   }
>
> +static void nfs_vm_close(struct vm_area_struct * vma)
> +{
> +	struct file *filp = vma->vm_file;
> +
> +	nfs_file_flush(filp, (fl_owner_t)filp);
> +}
> +
>   static const struct vm_operations_struct nfs_file_vm_ops = {
>   	.fault = filemap_fault,
>   	.page_mkwrite = nfs_vm_page_mkwrite,
> +	.close = nfs_vm_close,
>   };
>
>   static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>
> I would like some feedback on what solution is most acceptable, or any
> other suggestions.

Neither solution is acceptable. This isn't a close-to-open cache
consistency issue.

The syntax of mmap() for both block and NFS mounts is the same: writes
are not guaranteed to hit the disk until your application explicitly
calls msync().

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com
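
From the application side, the msync() rule given in this reply could look roughly like the sketch below. It is only an illustration under assumptions: the helper name write_with_msync() and its arguments are hypothetical, and MS_SYNC is chosen so that the call blocks until the write-back has finished and any error is reported to the caller.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: same open/mmap/close sequence as in the report, but
 * the modification is written back explicitly with msync() before munmap().
 * Returns 0 on success and -1 on any failure. */
int write_with_msync(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    if (len == 0)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) < 0) {        /* size the file for the mapping */
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    if (close(fd) < 0) {                        /* the mapping outlives the descriptor */
        munmap(map, len);
        return -1;
    }

    memcpy(map, msg, len);                      /* dirty the mapped page(s) */

    /* Explicit write-back: MS_SYNC waits for completion, so a failure
     * shows up here as a non-zero return value. */
    int rc = msync(map, len, MS_SYNC);

    if (munmap(map, len) < 0)
        rc = -1;
    return rc;
}
```

Unlike a flush hidden in release() or in a vm_ops close hook, this gives the application a return value to check when the write-back fails.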

2012-05-04 16:04:02

by Niels de Vos

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On 05/03/2012 07:26 PM, Myklebust, Trond wrote:
> On Thu, 2012-05-03 at 19:07 +0200, Niels de Vos wrote:
>> On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
>> > On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
>> >> When an application on an NFS-client (tested with NFSv3) executes the
>> >> following steps, data written after the close() is never flushed to the
>> >> server:
>> >>
>> >> 1. open()
>> >> 2. mmap()
>> >> 3. close()
>> >> 4. <modify data in the mmap'ed area>
>> >> 5. munmap()
>> >>
>> >> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
>> >> result in the data being sent to the server.
>> >>
>> >> The man-page for mmap (man 2 mmap) does mention that closing the file-
>> >> descriptor does not munmap() the area. Using the mmap'ed area after a
>> >> close() sound valid to me (even if it may be bad practice).
>> >>
>> >> Investigation and checking showed that the NFS-client does not handle
>> >> munmap(), and only flushes on close(). To solve this problem, least two
>> >> solutions can be proposed:
>> >>
>> >> a. f_ops->release() is called on munmap() as well as on close(),
>> >> therefore release() can be used to flush data as well.
>> >> b. In the 'struct vm_operations_struct' add a .close to the
>> >> 'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
>> >> the data in the new close() function.
>> >>
>> >> Solution a. contains currently very few code changes:
>> >>
>> >> --- a/fs/nfs/inode.c
>> >> +++ b/fs/nfs/inode.c
>> >> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
>> >>
>> >> int nfs_release(struct inode *inode, struct file *filp)
>> >> {
>> >> + if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
>> >> + nfs_sync_mapping(inode->i_mapping);
>> >> nfs_file_clear_open_context(filp);
>> >> return 0;
>> >> }
>> >>
>> >> The disadvantage is, that nfs_release() is called on close() too. That
>> >> means this causes a flushing of dirty pages, and just after that the
>> >> nfs_file_clear_open_context() might flush again. The advantage is that
>> >> it is possible (though not done at the moment) to return an error in
>> >> case flushing failed.
>> >>
>> >> Solution b. does not provide an option to return an error, but does not
>> >> get called on each close():
>> >>
>> >> --- a/fs/nfs/file.c
>> >> +++ b/fs/nfs/file.c
>> >> @@ -547,9 +547,17 @@ out:
>> >> return ret;
>> >> }
>> >>
>> >> +static void nfs_vm_close(struct vm_area_struct * vma)
>> >> +{
>> >> + struct file *filp = vma->vm_file;
>> >> +
>> >> + nfs_file_flush(filp, (fl_owner_t)filp);
>> >> +}
>> >> +
>> >> static const struct vm_operations_struct nfs_file_vm_ops = {
>> >> .fault = filemap_fault,
>> >> .page_mkwrite = nfs_vm_page_mkwrite,
>> >> + .close = nfs_vm_close,
>> >> };
>> >>
>> >> static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>> >>
>> >> I would like some feedback on what solution is most acceptable, or any
>> >> other suggestions.
>> >
>> > Neither solution is acceptable. This isn't a close-to-open cache
>> > consistency issue.
>> >
>> > The syntax of mmap() for both block and NFS mounts is the same: writes
>> > are not guaranteed to hit the disk until your application explicitly
>> > calls msync().
>> >
>>
>> Okay, that makes sense. But if the application never calls msync(), and
>> just munmap()'s the area, when should the changes be written? I did not
>> expect that unmounting just disregards the data.
>
> That suggests that the VM is failing to dirty the pages on munmap()
> before releasing the vma->vm_file. If so, then that would be a VM bug...
>

I've checked whether the VM tags the pages as dirty:
- f_ops->release() is called on munmap(). A printk added there shows
  that inode->i_state has I_DIRTY_PAGES set.
- mapping_tagged(filp->f_mapping, PAGECACHE_TAG_DIRTY) also returns true.

From my understanding this is what the VM is expected to do, and the
pages are marked dirty correctly.

However, nfs_inode->ndirty and nfs_inode->ncommit are both 0. It is
unclear to me how the VM is supposed to interact with the nfs_inode.
Some clarification, or a suggestion on what to look into, would be much
appreciated.

Cheers,
Niels

2012-05-03 17:27:50

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On Thu, 2012-05-03 at 19:07 +0200, Niels de Vos wrote:
> On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
> > On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
> >> When an application on an NFS-client (tested with NFSv3) executes the
> >> following steps, data written after the close() is never flushed to the
> >> server:
> >>
> >> 1. open()
> >> 2. mmap()
> >> 3. close()
> >> 4. <modify data in the mmap'ed area>
> >> 5. munmap()
> >>
> >> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
> >> result in the data being sent to the server.
> >>
> >> The man-page for mmap (man 2 mmap) does mention that closing the file-
> >> descriptor does not munmap() the area. Using the mmap'ed area after a
> >> close() sound valid to me (even if it may be bad practice).
> >>
> >> Investigation and checking showed that the NFS-client does not handle
> >> munmap(), and only flushes on close(). To solve this problem, least two
> >> solutions can be proposed:
> >>
> >> a. f_ops->release() is called on munmap() as well as on close(),
> >>     therefore release() can be used to flush data as well.
> >> b. In the 'struct vm_operations_struct' add a .close to the
> >>     'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
> >>     the data in the new close() function.
> >>
> >> Solution a. contains currently very few code changes:
> >>
> >> --- a/fs/nfs/inode.c
> >> +++ b/fs/nfs/inode.c
> >> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
> >>
> >>   int nfs_release(struct inode *inode, struct file *filp)
> >>   {
> >> +       if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
> >> +               nfs_sync_mapping(inode->i_mapping);
> >>          nfs_file_clear_open_context(filp);
> >>          return 0;
> >>   }
> >>
> >> The disadvantage is, that nfs_release() is called on close() too. That
> >> means this causes a flushing of dirty pages, and just after that the
> >> nfs_file_clear_open_context() might flush again. The advantage is that
> >> it is possible (though not done at the moment) to return an error in
> >> case flushing failed.
> >>
> >> Solution b. does not provide an option to return an error, but does not
> >> get called on each close():
> >>
> >> --- a/fs/nfs/file.c
> >> +++ b/fs/nfs/file.c
> >> @@ -547,9 +547,17 @@ out:
> >>   	return ret;
> >>   }
> >>
> >> +static void nfs_vm_close(struct vm_area_struct * vma)
> >> +{
> >> +	struct file *filp = vma->vm_file;
> >> +
> >> +	nfs_file_flush(filp, (fl_owner_t)filp);
> >> +}
> >> +
> >>   static const struct vm_operations_struct nfs_file_vm_ops = {
> >>   	.fault = filemap_fault,
> >>   	.page_mkwrite = nfs_vm_page_mkwrite,
> >> +	.close = nfs_vm_close,
> >>   };
> >>
> >>   static int nfs_need_sync_write(struct file *filp, struct inode *inode)
> >>
> >> I would like some feedback on what solution is most acceptable, or any
> >> other suggestions.
> >
> > Neither solution is acceptable. This isn't a close-to-open cache
> > consistency issue.
> >
> > The syntax of mmap() for both block and NFS mounts is the same: writes
> > are not guaranteed to hit the disk until your application explicitly
> > calls msync().
> >
>
> Okay, that makes sense. But if the application never calls msync(), and
> just munmap()'s the area, when should the changes be written? I did not
> expect that unmounting just disregards the data.

That suggests that the VM is failing to dirty the pages on munmap()
before releasing the vma->vm_file. If so, then that would be a VM bug...

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com

2012-05-07 08:49:27

by Niels de Vos

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On 05/04/2012 08:29 PM, Myklebust, Trond wrote:
> On Fri, 2012-05-04 at 18:03 +0200, Niels de Vos wrote:
>> On 05/03/2012 07:26 PM, Myklebust, Trond wrote:
>> > On Thu, 2012-05-03 at 19:07 +0200, Niels de Vos wrote:
>> >> On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
>> >> > On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
>> >> >> When an application on an NFS-client (tested with NFSv3) executes the
>> >> >> following steps, data written after the close() is never flushed to the
>> >> >> server:
>> >> >>
>> >> >> 1. open()
>> >> >> 2. mmap()
>> >> >> 3. close()
>> >> >> 4. <modify data in the mmap'ed area>
>> >> >> 5. munmap()
>> >> >>
>> >> >> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
>> >> >> result in the data being sent to the server.
>> >> >>
>> >> >> The man-page for mmap (man 2 mmap) does mention that closing the file-
>> >> >> descriptor does not munmap() the area. Using the mmap'ed area after a
>> >> >> close() sound valid to me (even if it may be bad practice).
>> >> >>
>> >> >> Investigation and checking showed that the NFS-client does not handle
>> >> >> munmap(), and only flushes on close(). To solve this problem, least two
>> >> >> solutions can be proposed:
>> >> >>
>> >> >> a. f_ops->release() is called on munmap() as well as on close(),
>> >> >> therefore release() can be used to flush data as well.
>> >> >> b. In the 'struct vm_operations_struct' add a .close to the
>> >> >> 'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
>> >> >> the data in the new close() function.
>> >> >>
>> >> >> Solution a. contains currently very few code changes:
>> >> >>
>> >> >> --- a/fs/nfs/inode.c
>> >> >> +++ b/fs/nfs/inode.c
>> >> >> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
>> >> >>
>> >> >> int nfs_release(struct inode *inode, struct file *filp)
>> >> >> {
>> >> >> + if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
>> >> >> + nfs_sync_mapping(inode->i_mapping);
>> >> >> nfs_file_clear_open_context(filp);
>> >> >> return 0;
>> >> >> }
>> >> >>
>> >> >> The disadvantage is, that nfs_release() is called on close() too. That
>> >> >> means this causes a flushing of dirty pages, and just after that the
>> >> >> nfs_file_clear_open_context() might flush again. The advantage is that
>> >> >> it is possible (though not done at the moment) to return an error in
>> >> >> case flushing failed.
>> >> >>
>> >> >> Solution b. does not provide an option to return an error, but does not
>> >> >> get called on each close():
>> >> >>
>> >> >> --- a/fs/nfs/file.c
>> >> >> +++ b/fs/nfs/file.c
>> >> >> @@ -547,9 +547,17 @@ out:
>> >> >> return ret;
>> >> >> }
>> >> >>
>> >> >> +static void nfs_vm_close(struct vm_area_struct * vma)
>> >> >> +{
>> >> >> + struct file *filp = vma->vm_file;
>> >> >> +
>> >> >> + nfs_file_flush(filp, (fl_owner_t)filp);
>> >> >> +}
>> >> >> +
>> >> >> static const struct vm_operations_struct nfs_file_vm_ops = {
>> >> >> .fault = filemap_fault,
>> >> >> .page_mkwrite = nfs_vm_page_mkwrite,
>> >> >> + .close = nfs_vm_close,
>> >> >> };
>> >> >>
>> >> >> static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>> >> >>
>> >> >> I would like some feedback on what solution is most acceptable, or any
>> >> >> other suggestions.
>> >> >
>> >> > Neither solution is acceptable. This isn't a close-to-open cache
>> >> > consistency issue.
>> >> >
>> >> > The syntax of mmap() for both block and NFS mounts is the same: writes
>> >> > are not guaranteed to hit the disk until your application explicitly
>> >> > calls msync().
>> >> >
>> >>
>> >> Okay, that makes sense. But if the application never calls msync(), and
>> >> just munmap()'s the area, when should the changes be written? I did not
>> >> expect that unmounting just disregards the data.
>> >
>> > That suggests that the VM is failing to dirty the pages on munmap()
>> > before releasing the vma->vm_file. If so, then that would be a VM bug...
>> >
>>
>> I've checked if the VM tags the pages as dirty:
>> - f_ops->release() is called on munmap(). An added printk there, shows
>> that inode->i_state is set to I_DIRTY_PAGE.
>> - mapping_tagged(filp->f_mapping, PAGECACHE_TAG_DIRTY) also returns true
>>
>> From my understanding this is what the VM is expected to do, and the
>> pages are marked dirty correctly.
>>
>> However, nfs_inode->ndirty and nfs_inode->ncommit are both 0. It is
>> unclear to me how the VM is supposed to interact with the nfs_inode.
>> Some clarification or suggestion what to look into would be much
>> appreciated.
>
> The first time the page is touched, it will trigger a ->page_mkwrite(),
> which in the case of NFS will set up the necessary tracking structures
> to ensure that the page is written out using the correct credentials
> etc. In the case of NFSv4, it will also ensure that the file doesn't get
> closed on the server until the page is written out to disk.
>
> When the page is cleaned (i.e. something calls clear_page_dirty_for_io()
> as part of a write to disk), the call to page_mkclean() is supposed to
> re-write-protect the pte, ensuring that any future changes will
> re-trigger ->page_mkwrite().
>
> You should be able to check if/when nfs_vm_page_mkwrite() is triggered
> using 'rpcdebug -m nfs -s pagecache' to turn on the NFS page cache
> debugging printks.
>
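
The write-protect cycle described above can be imitated in user space. The sketch below is only an analogy and not kernel or NFS code: a SIGSEGV handler stands in for ->page_mkwrite(), and re-protecting the page with mprotect() stands in for page_mkclean(). All tracker_* names are hypothetical, it assumes Linux (write faults delivered as SIGSEGV), and it calls mprotect() from a signal handler, which works on Linux but is not guaranteed by POSIX.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile char *page;                 /* the single tracked page */
static size_t page_len;
static volatile sig_atomic_t first_writes;  /* how often the "mkwrite" hook ran */

/* Runs on the first store to the write-protected page; stands in for
 * ->page_mkwrite(): note the write, then allow further stores. */
static void on_write_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    char *addr = (char *)si->si_addr;
    if (addr < (char *)page || addr >= (char *)page + page_len)
        _exit(1);                           /* unrelated fault: give up */
    first_writes++;
    mprotect((void *)page, page_len, PROT_READ | PROT_WRITE);
}

/* Install the handler and map one read-only page. Returns 0 or -1. */
int tracker_init(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) < 0)
        return -1;

    page_len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page_len, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    page = p;
    return 0;
}

/* Stands in for cleaning the page: write-protect it again so that the
 * next store faults and is noticed. */
int tracker_clean(void)
{
    return mprotect((void *)page, page_len, PROT_READ);
}

void tracker_store(size_t off, char value)
{
    page[off] = value;                      /* faults if the page is protected */
}

int tracker_first_writes(void)
{
    return (int)first_writes;
}
```

After tracker_init(), only the first tracker_store() runs the handler; later stores go through unnoticed until tracker_clean() re-protects the page, which mirrors why a page that is never re-protected after cleaning would collect further writes without the filesystem hearing about them.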

Many thanks for the explanation! At the moment I'm a little uncertain
where the problem lies, as it no longer occurs with more recent
kernels. There likely was some invalid testing on my side :-/

I think I have to hunt down what changes were made and how they affect
writing to mmap()'d files. The explanation you gave helps a lot in
understanding how NFS handles all this.

Thanks again, and sorry for any confusion,
Niels

2012-05-03 17:07:55

by Niels de Vos

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
> On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
>> When an application on an NFS-client (tested with NFSv3) executes the
>> following steps, data written after the close() is never flushed to the
>> server:
>>
>> 1. open()
>> 2. mmap()
>> 3. close()
>> 4. <modify data in the mmap'ed area>
>> 5. munmap()
>>
>> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
>> result in the data being sent to the server.
>>
>> The man-page for mmap (man 2 mmap) does mention that closing the file-
>> descriptor does not munmap() the area. Using the mmap'ed area after a
>> close() sound valid to me (even if it may be bad practice).
>>
>> Investigation and checking showed that the NFS-client does not handle
>> munmap(), and only flushes on close(). To solve this problem, least two
>> solutions can be proposed:
>>
>> a. f_ops->release() is called on munmap() as well as on close(),
>> therefore release() can be used to flush data as well.
>> b. In the 'struct vm_operations_struct' add a .close to the
>> 'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
>> the data in the new close() function.
>>
>> Solution a. contains currently very few code changes:
>>
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
>>
>> int nfs_release(struct inode *inode, struct file *filp)
>> {
>> + if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
>> + nfs_sync_mapping(inode->i_mapping);
>> nfs_file_clear_open_context(filp);
>> return 0;
>> }
>>
>> The disadvantage is, that nfs_release() is called on close() too. That
>> means this causes a flushing of dirty pages, and just after that the
>> nfs_file_clear_open_context() might flush again. The advantage is that
>> it is possible (though not done at the moment) to return an error in
>> case flushing failed.
>>
>> Solution b. does not provide an option to return an error, but does not
>> get called on each close():
>>
>> --- a/fs/nfs/file.c
>> +++ b/fs/nfs/file.c
>> @@ -547,9 +547,17 @@ out:
>> return ret;
>> }
>>
>> +static void nfs_vm_close(struct vm_area_struct * vma)
>> +{
>> + struct file *filp = vma->vm_file;
>> +
>> + nfs_file_flush(filp, (fl_owner_t)filp);
>> +}
>> +
>> static const struct vm_operations_struct nfs_file_vm_ops = {
>> .fault = filemap_fault,
>> .page_mkwrite = nfs_vm_page_mkwrite,
>> + .close = nfs_vm_close,
>> };
>>
>> static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>>
>> I would like some feedback on what solution is most acceptable, or any
>> other suggestions.
>
> Neither solution is acceptable. This isn't a close-to-open cache
> consistency issue.
>
> The syntax of mmap() for both block and NFS mounts is the same: writes
> are not guaranteed to hit the disk until your application explicitly
> calls msync().
>

Okay, that makes sense. But if the application never calls msync(), and
just munmap()'s the area, when should the changes be written? I did not
expect that unmounting just disregards the data.

Any suggestions on correcting this (if it should be corrected) are
welcome.

Thanks again,
Niels

2012-05-04 18:29:37

by Myklebust, Trond

[permalink] [raw]
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor

On Fri, 2012-05-04 at 18:03 +0200, Niels de Vos wrote:
> On 05/03/2012 07:26 PM, Myklebust, Trond wrote:
> > On Thu, 2012-05-03 at 19:07 +0200, Niels de Vos wrote:
> >> On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
> >> > On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
> >> >> When an application on an NFS-client (tested with NFSv3) executes the
> >> >> following steps, data written after the close() is never flushed to the
> >> >> server:
> >> >>
> >> >> 1. open()
> >> >> 2. mmap()
> >> >> 3. close()
> >> >> 4. <modify data in the mmap'ed area>
> >> >> 5. munmap()
> >> >>
> >> >> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
> >> >> result in the data being sent to the server.
> >> >>
> >> >> The man-page for mmap (man 2 mmap) does mention that closing the file-
> >> >> descriptor does not munmap() the area. Using the mmap'ed area after a
> >> >> close() sound valid to me (even if it may be bad practice).
> >> >>
> >> >> Investigation and checking showed that the NFS-client does not handle
> >> >> munmap(), and only flushes on close(). To solve this problem, least two
> >> >> solutions can be proposed:
> >> >>
> >> >> a. f_ops->release() is called on munmap() as well as on close(),
> >> >>     therefore release() can be used to flush data as well.
> >> >> b. In the 'struct vm_operations_struct' add a .close to the
> >> >>     'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
> >> >>     the data in the new close() function.
> >> >>
> >> >> Solution a. contains currently very few code changes:
> >> >>
> >> >> --- a/fs/nfs/inode.c
> >> >> +++ b/fs/nfs/inode.c
> >> >> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
> >> >>
> >> >>   int nfs_release(struct inode *inode, struct file *filp)
> >> >>   {
> >> >> +       if (S_ISREG(inode->i_mode) && inode->i_mapping->nrpages != 0) {
> >> >> +               nfs_sync_mapping(inode->i_mapping);
> >> >>          nfs_file_clear_open_context(filp);
> >> >>          return 0;
> >> >>   }
> >> >>
> >> >> The disadvantage is, that nfs_release() is called on close() too. That
> >> >> means this causes a flushing of dirty pages, and just after that the
> >> >> nfs_file_clear_open_context() might flush again. The advantage is that
> >> >> it is possible (though not done at the moment) to return an error in
> >> >> case flushing failed.
> >> >>
> >> >> Solution b. does not provide an option to return an error, but does not
> >> >> get called on each close():
> >> >>
> >> >> --- a/fs/nfs/file.c
> >> >> +++ b/fs/nfs/file.c
> >> >> @@ -547,9 +547,17 @@ out:
> >> >>   	return ret;
> >> >>   }
> >> >>
> >> >> +static void nfs_vm_close(struct vm_area_struct * vma)
> >> >> +{
> >> >> +	struct file *filp = vma->vm_file;
> >> >> +
> >> >> +	nfs_file_flush(filp, (fl_owner_t)filp);
> >> >> +}
> >> >> +
> >> >>   static const struct vm_operations_struct nfs_file_vm_ops = {
> >> >>   	.fault = filemap_fault,
> >> >>   	.page_mkwrite = nfs_vm_page_mkwrite,
> >> >> +	.close = nfs_vm_close,
> >> >>   };
> >> >>
> >> >>   static int nfs_need_sync_write(struct file *filp, struct inode *inode)
> >> >>
> >> >> I would like some feedback on what solution is most acceptable, or any
> >> >> other suggestions.
> >> >
> >> > Neither solution is acceptable. This isn't a close-to-open cache
> >> > consistency issue.
> >> >
> >> > The syntax of mmap() for both block and NFS mounts is the same: writes
> >> > are not guaranteed to hit the disk until your application explicitly
> >> > calls msync().
> >> >
> >>
> >> Okay, that makes sense. But if the application never calls msync(), and
> >> just munmap()'s the area, when should the changes be written? I did not
> >> expect that unmounting just disregards the data.
> >
> > That suggests that the VM is failing to dirty the pages on munmap()
> > before releasing the vma->vm_file. If so, then that would be a VM bug...
> >
>
> I've checked if the VM tags the pages as dirty:
> - f_ops->release() is called on munmap(). An added printk there, shows
>    that inode->i_state is set to I_DIRTY_PAGE.
> - mapping_tagged(filp->f_mapping, PAGECACHE_TAG_DIRTY) also returns true
>
> From my understanding this is what the VM is expected to do, and the
> pages are marked dirty correctly.
>
> However, nfs_inode->ndirty and nfs_inode->ncommit are both 0. It is
> unclear to me how the VM is supposed to interact with the nfs_inode.
> Some clarification or suggestion what to look into would be much
> appreciated.

The first time the page is touched, it will trigger a ->page_mkwrite(),
which in the case of NFS will set up the necessary tracking structures
to ensure that the page is written out using the correct credentials
etc. In the case of NFSv4, it will also ensure that the file doesn't get
closed on the server until the page is written out to disk.

When the page is cleaned (i.e. something calls clear_page_dirty_for_io()
as part of a write to disk), the call to page_mkclean() is supposed to
re-write-protect the pte, ensuring that any future changes will
re-trigger ->page_mkwrite().

You should be able to check if/when nfs_vm_page_mkwrite() is triggered
using 'rpcdebug -m nfs -s pagecache' to turn on the NFS page cache
debugging printks.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
www.netapp.com