2015-10-07 18:57:21

by Trond Myklebust

[permalink] [raw]
Subject: RCU caching regression in kernel v4.1+

Hi Al,

Please could you take a look at the bugzilla entry in
https://bugzilla.kernel.org/show_bug.cgi?id=104911 ?

It describes a NFS caching regression that appears to be caused by
commit 766c4cbfacd8634d7580bac6a1b8456e63de3e84 ("namei:
d_is_negative() should be checked before ->d_seq validation").

Shouldn't that test for 'if (negative) return -ENOENT;' happen after
the call to d_revalidate() in lookup_fast()? If not, we can end up
caching negative dentries forever, AFAICS...

Cheers
Trond


2015-10-08 12:54:53

by Trond Myklebust

[permalink] [raw]
Subject: Re: RCU caching regression in kernel v4.1+

On Wed, 2015-10-07 at 14:57 -0400, Trond Myklebust wrote:
> Hi Al,
>
> Please could you take a look at the bugzilla entry in
> https://bugzilla.kernel.org/show_bug.cgi?id=104911 ?
>
> It describes a NFS caching regression that appears to be caused by
> commit 766c4cbfacd8634d7580bac6a1b8456e63de3e84 ("namei:
> d_is_negative() should be checked before ->d_seq validation").
>
> Shouldn't that test for 'if (negative) return -ENOENT;' happen after
> the call to d_revalidate() in lookup_fast()? If not, we can end up
> caching negative dentries forever, AFAICS...
>
> Cheers
> Trond

Leandro, can you please test if the following patch helps in any way?

Cheers
Trond

8<-----------------------------------------------------------------

2015-10-08 17:28:41

by Leandro Awa

[permalink] [raw]
Subject: RE: RCU caching regression in kernel v4.1+

SGkgVHJvbmQsDQoNClN1cmUuICAgSSdtIHJ1bm5pbmcgdGhlIHRlc3Qgbm93LiAgSXQgc2hvdWxk
IGJlIGRvbmUgd2l0aGluIHRoZSBuZXh0IDQgaG91cnMuDQoNCkJlc3QgUmVnYXJkcywNCkxlYW5k
cm8gQXdhDQoNCg0KLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCkZyb206IFRyb25kIE15a2xl
YnVzdCBbbWFpbHRvOnRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb21dIA0KU2VudDogVGh1
cnNkYXksIE9jdG9iZXIgMDgsIDIwMTUgNTo1NSBBTQ0KVG86IEFsZXhhbmRlciBWaXJvDQpDYzog
TGludXggTkZTIE1haWxpbmcgTGlzdDsgTGVhbmRybyBBd2E7IExpbnV4IEZTLWRldmVsIE1haWxp
bmcgTGlzdA0KU3ViamVjdDogUmU6IFJDVSBjYWNoaW5nIHJlZ3Jlc3Npb24gaW4ga2VybmVsIHY0
LjErDQoNCk9uIFdlZCwgMjAxNS0xMC0wNyBhdCAxNDo1NyAtMDQwMCwgVHJvbmQgTXlrbGVidXN0
IHdyb3RlOg0KPiBIaSBBbCwNCj4gDQo+IFBsZWFzZSBjb3VsZCB5b3UgdGFrZSBhIGxvb2sgYXQg
dGhlIGJ1Z3ppbGxhIGVudHJ5IGluDQo+IGh0dHBzOi8vYnVnemlsbGEua2VybmVsLm9yZy9zaG93
X2J1Zy5jZ2k/aWQ9MTA0OTExID8NCj4gDQo+IEl0IGRlc2NyaWJlcyBhIE5GUyBjYWNoaW5nIHJl
Z3Jlc3Npb24gdGhhdCBhcHBlYXJzIHRvIGJlIGNhdXNlZCBieSANCj4gY29tbWl0IDc2NmM0Y2Jm
YWNkODYzNGQ3NTgwYmFjNmExYjg0NTZlNjNkZTNlODQgKCJuYW1laToNCj4gZF9pc19uZWdhdGl2
ZSgpIHNob3VsZCBiZSBjaGVja2VkIGJlZm9yZSAtPmRfc2VxIHZhbGlkYXRpb24iKS4NCj4gDQo+
IFNob3VsZG4ndCB0aGF0IHRlc3QgZm9yICdpZiAobmVnYXRpdmUpIHJldHVybiAtRU5PRU5UOycg
aGFwcGVuIGFmdGVyIA0KPiB0aGUgY2FsbCB0byBkX3JldmFsaWRhdGUoKSBpbiBsb29rdXBfZmFz
dCgpPyBJZiBub3QsIHdlIGNhbiBlbmQgdXAgDQo+IGNhY2hpbmcgbmVnYXRpdmUgZGVudHJpZXMg
Zm9yZXZlciwgQUZBSUNTLi4uDQo+IA0KPiBDaGVlcnMNCj4gICBUcm9uZA0KDQpMZWFuZHJvLCBj
YW4geW91IHBsZWFzZSB0ZXN0IGlmIHRoZSBmb2xsb3dpbmcgcGF0Y2ggaGVscHMgaW4gYW55IHdh
eT8NCg0KQ2hlZXJzDQogIFRyb25kDQoNCjg8LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0NCkZyb20gZWI2MWVjZTU3MzliYjJm
M2I2ZDAzZGQ4Y2E4ZTMzNWJmMGQxMjY4NyBNb24gU2VwIDE3IDAwOjAwOjAwIDIwMDENCkZyb206
IFRyb25kIE15a2xlYnVzdCA8dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbT4NCkRhdGU6
IFRodSwgOCBPY3QgMjAxNSAwODo0NDowMCAtMDQwMA0KU3ViamVjdDogW1BBVENIXSBuYW1laTog
cmVzdWx0cyBvZiBkX2lzX25lZ2F0aXZlKCkgc2hvdWxkIGJlIGNoZWNrZWQgYWZ0ZXIgIGRlbnRy
eSByZXZhbGlkYXRpb24NCg0KTGVhbmRybyBBd2Egd3JpdGVzOg0KQWZ0ZXIgc3dpdGNoaW5nIHRv
IHZlcnNpb24gNC4xLjYsIG91ciBwYXJhbGxlbGl6ZWQgYW5kIGRpc3RyaWJ1dGVkIHdvcmtmbG93
cyBub3cgIGZhaWwgY29uc2lzdGVudGx5IHdpdGggZXJyb3JzIG9mIHRoZSBmb3JtOg0KDQpUMzQ6
IC4vcmVnZXguYzozOToyMjogZXJyb3I6IGNvbmZpZy5oOiBObyBzdWNoIGZpbGUgb3IgZGlyZWN0
b3J5DQoNCkZyb20gb3VyICdnaXQgYmlzZWN0JyB0ZXN0aW5nLCB0aGUgZm9sbG93aW5nIGNvbW1p
dCBhcHBlYXJzIHRvIGJlIHRoZSBwb3NzaWJsZSBjYXVzZSBvZiB0aGUgYmVoYXZpb3Igd2UndmUg
YmVlbiBzZWVpbmc6IGNvbW1pdCA3NjZjNGNiZmFjZDgNCg0KVGhlIGlzc3VlIGlzIHRoYXQgcmV2
YWxpZGF0aW9uIG1heSBjYXVzZSB0aGUgZGVudHJ5IHRvIGJlIGRyb3BwZWQgaW4gTkZTIGlmLCBz
YXksIHRoZSBjbGllbnQgbm90ZXMgdGhhdCB0aGUgZGlyZWN0b3J5IHRpbWVzdGFtcHMgaGF2ZSBj
aGFuZ2VkLg0KDQpSZXBvcnRlZC1ieTogTGVhbmRybyBBd2EgPGxhd2FAbnZpZGlhLmNvbT4NCkxp
bms6IGh0dHBzOi8vYnVnemlsbGEua2VybmVsLm9yZy9zaG93X2J1Zy5jZ2k/aWQ9MTA0OTExDQpG
aXhlczogNzY2YzRjYmZhY2Q4ICgibmFtZWk6IGRfaXNfbmVnYXRpdmUoKSBzaG91bGQgYmUgY2hl
Y2tlZC4uLiIpDQpDYzogc3RhYmxlQHZnZXIua2VybmVsLm9yZyAjIHY0LjErDQpTaWduZWQtb2Zm
LWJ5OiBUcm9uZCBNeWtsZWJ1c3QgPHRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20+DQot
LS0NCiBmcy9uYW1laS5jIHwgOCArKysrKystLQ0KIDEgZmlsZSBjaGFuZ2VkLCA2IGluc2VydGlv
bnMoKyksIDIgZGVsZXRpb25zKC0pDQoNCmRpZmYgLS1naXQgYS9mcy9uYW1laS5jIGIvZnMvbmFt
ZWkuYw0KaW5kZXggNzI2ZDIxMWRiNDg0Li4zM2U5NDk1YTMxMjkgMTAwNjQ0DQotLS0gYS9mcy9u
YW1laS5jDQorKysgYi9mcy9uYW1laS5jDQpAQCAtMTU1OCw4ICsxNTU4LDYgQEAgc3RhdGljIGlu
dCBsb29rdXBfZmFzdChzdHJ1Y3QgbmFtZWlkYXRhICpuZCwNCiAJCW5lZ2F0aXZlID0gZF9pc19u
ZWdhdGl2ZShkZW50cnkpOw0KIAkJaWYgKHJlYWRfc2VxY291bnRfcmV0cnkoJmRlbnRyeS0+ZF9z
ZXEsIHNlcSkpDQogCQkJcmV0dXJuIC1FQ0hJTEQ7DQotCQlpZiAobmVnYXRpdmUpDQotCQkJcmV0
dXJuIC1FTk9FTlQ7DQogDQogCQkvKg0KIAkJICogVGhpcyBzZXF1ZW5jZSBjb3VudCB2YWxpZGF0
ZXMgdGhhdCB0aGUgcGFyZW50IGhhZCBubyBAQCAtMTU4MCw2ICsxNTc4LDEyIEBAIHN0YXRpYyBp
bnQgbG9va3VwX2Zhc3Qoc3RydWN0IG5hbWVpZGF0YSAqbmQsDQogCQkJCWdvdG8gdW5sYXp5Ow0K
IAkJCX0NCiAJCX0NCisJCS8qDQorCQkgKiBOb3RlOiBkbyBuZWdhdGl2ZSBkZW50cnkgY2hlY2sg
YWZ0ZXIgcmV2YWxpZGF0aW9uIGluDQorCQkgKiBjYXNlIHRoYXQgZHJvcHMgaXQuDQorCQkgKi8N
CisJCWlmIChuZWdhdGl2ZSkNCisJCQlyZXR1cm4gLUVOT0VOVDsNCiAJCXBhdGgtPm1udCA9IG1u
dDsNCiAJCXBhdGgtPmRlbnRyeSA9IGRlbnRyeTsNCiAJCWlmIChsaWtlbHkoX19mb2xsb3dfbW91
bnRfcmN1KG5kLCBwYXRoLCBpbm9kZSwgc2VxcCkpKQ0KLS0NCjIuNC4zDQoNCi0tDQpUcm9uZCBN
eWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lciwgUHJpbWFyeURhdGEgdHJvbmQu
bXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbQ0KDQoNCg0KDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLQ0KVGhpcyBlbWFpbCBtZXNzYWdlIGlzIGZvciB0aGUgc29sZSB1c2Ugb2YgdGhlIGludGVu
ZGVkIHJlY2lwaWVudChzKSBhbmQgbWF5IGNvbnRhaW4NCmNvbmZpZGVudGlhbCBpbmZvcm1hdGlv
bi4gIEFueSB1bmF1dGhvcml6ZWQgcmV2aWV3LCB1c2UsIGRpc2Nsb3N1cmUgb3IgZGlzdHJpYnV0
aW9uDQppcyBwcm9oaWJpdGVkLiAgSWYgeW91IGFyZSBub3QgdGhlIGludGVuZGVkIHJlY2lwaWVu
dCwgcGxlYXNlIGNvbnRhY3QgdGhlIHNlbmRlciBieQ0KcmVwbHkgZW1haWwgYW5kIGRlc3Ryb3kg
YWxsIGNvcGllcyBvZiB0aGUgb3JpZ2luYWwgbWVzc2FnZS4NCi0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tDQo=

2015-10-09 00:01:18

by Leandro Awa

[permalink] [raw]
Subject: RE: RCU caching regression in kernel v4.1+

RnlpICAgICBUaGUgcGF0Y2ggZGVmaW5pdGVseSBoZWxwZWQuDQoNClRoYW5rIHlvdS4NCg0KUmVn
YXJkcywNCkxlYW5kcm8gQXdhDQpNSVMtRmFybSBTdXBwb3J0DQoNCg0KLS0tLS1PcmlnaW5hbCBN
ZXNzYWdlLS0tLS0NCkZyb206IExlYW5kcm8gQXdhIA0KU2VudDogVGh1cnNkYXksIE9jdG9iZXIg
MDgsIDIwMTUgMTA6MjkgQU0NClRvOiAnVHJvbmQgTXlrbGVidXN0JzsgQWxleGFuZGVyIFZpcm8N
CkNjOiBMaW51eCBORlMgTWFpbGluZyBMaXN0OyBMaW51eCBGUy1kZXZlbCBNYWlsaW5nIExpc3QN
ClN1YmplY3Q6IFJFOiBSQ1UgY2FjaGluZyByZWdyZXNzaW9uIGluIGtlcm5lbCB2NC4xKw0KDQpI
aSBUcm9uZCwNCg0KU3VyZS4gICBJJ20gcnVubmluZyB0aGUgdGVzdCBub3cuICBJdCBzaG91bGQg
YmUgZG9uZSB3aXRoaW4gdGhlIG5leHQgNCBob3Vycy4NCg0KQmVzdCBSZWdhcmRzLA0KTGVhbmRy
byBBd2ENCg0KDQotLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KRnJvbTogVHJvbmQgTXlrbGVi
dXN0IFttYWlsdG86dHJvbmQubXlrbGVidXN0QHByaW1hcnlkYXRhLmNvbV0NClNlbnQ6IFRodXJz
ZGF5LCBPY3RvYmVyIDA4LCAyMDE1IDU6NTUgQU0NClRvOiBBbGV4YW5kZXIgVmlybw0KQ2M6IExp
bnV4IE5GUyBNYWlsaW5nIExpc3Q7IExlYW5kcm8gQXdhOyBMaW51eCBGUy1kZXZlbCBNYWlsaW5n
IExpc3QNClN1YmplY3Q6IFJlOiBSQ1UgY2FjaGluZyByZWdyZXNzaW9uIGluIGtlcm5lbCB2NC4x
Kw0KDQpPbiBXZWQsIDIwMTUtMTAtMDcgYXQgMTQ6NTcgLTA0MDAsIFRyb25kIE15a2xlYnVzdCB3
cm90ZToNCj4gSGkgQWwsDQo+IA0KPiBQbGVhc2UgY291bGQgeW91IHRha2UgYSBsb29rIGF0IHRo
ZSBidWd6aWxsYSBlbnRyeSBpbg0KPiBodHRwczovL2J1Z3ppbGxhLmtlcm5lbC5vcmcvc2hvd19i
dWcuY2dpP2lkPTEwNDkxMSA/DQo+IA0KPiBJdCBkZXNjcmliZXMgYSBORlMgY2FjaGluZyByZWdy
ZXNzaW9uIHRoYXQgYXBwZWFycyB0byBiZSBjYXVzZWQgYnkgDQo+IGNvbW1pdCA3NjZjNGNiZmFj
ZDg2MzRkNzU4MGJhYzZhMWI4NDU2ZTYzZGUzZTg0ICgibmFtZWk6DQo+IGRfaXNfbmVnYXRpdmUo
KSBzaG91bGQgYmUgY2hlY2tlZCBiZWZvcmUgLT5kX3NlcSB2YWxpZGF0aW9uIikuDQo+IA0KPiBT
aG91bGRuJ3QgdGhhdCB0ZXN0IGZvciAnaWYgKG5lZ2F0aXZlKSByZXR1cm4gLUVOT0VOVDsnIGhh
cHBlbiBhZnRlciANCj4gdGhlIGNhbGwgdG8gZF9yZXZhbGlkYXRlKCkgaW4gbG9va3VwX2Zhc3Qo
KT8gSWYgbm90LCB3ZSBjYW4gZW5kIHVwIA0KPiBjYWNoaW5nIG5lZ2F0aXZlIGRlbnRyaWVzIGZv
cmV2ZXIsIEFGQUlDUy4uLg0KPiANCj4gQ2hlZXJzDQo+ICAgVHJvbmQNCg0KTGVhbmRybywgY2Fu
IHlvdSBwbGVhc2UgdGVzdCBpZiB0aGUgZm9sbG93aW5nIHBhdGNoIGhlbHBzIGluIGFueSB3YXk/
DQoNCkNoZWVycw0KICBUcm9uZA0KDQo4PC0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQpGcm9tIGViNjFlY2U1NzM5YmIyZjNi
NmQwM2RkOGNhOGUzMzViZjBkMTI2ODcgTW9uIFNlcCAxNyAwMDowMDowMCAyMDAxDQpGcm9tOiBU
cm9uZCBNeWtsZWJ1c3QgPHRyb25kLm15a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20+DQpEYXRlOiBU
aHUsIDggT2N0IDIwMTUgMDg6NDQ6MDAgLTA0MDANClN1YmplY3Q6IFtQQVRDSF0gbmFtZWk6IHJl
c3VsdHMgb2YgZF9pc19uZWdhdGl2ZSgpIHNob3VsZCBiZSBjaGVja2VkIGFmdGVyICBkZW50cnkg
cmV2YWxpZGF0aW9uDQoNCkxlYW5kcm8gQXdhIHdyaXRlczoNCkFmdGVyIHN3aXRjaGluZyB0byB2
ZXJzaW9uIDQuMS42LCBvdXIgcGFyYWxsZWxpemVkIGFuZCBkaXN0cmlidXRlZCB3b3JrZmxvd3Mg
bm93ICBmYWlsIGNvbnNpc3RlbnRseSB3aXRoIGVycm9ycyBvZiB0aGUgZm9ybToNCg0KVDM0OiAu
L3JlZ2V4LmM6Mzk6MjI6IGVycm9yOiBjb25maWcuaDogTm8gc3VjaCBmaWxlIG9yIGRpcmVjdG9y
eQ0KDQpGcm9tIG91ciAnZ2l0IGJpc2VjdCcgdGVzdGluZywgdGhlIGZvbGxvd2luZyBjb21taXQg
YXBwZWFycyB0byBiZSB0aGUgcG9zc2libGUgY2F1c2Ugb2YgdGhlIGJlaGF2aW9yIHdlJ3ZlIGJl
ZW4gc2VlaW5nOiBjb21taXQgNzY2YzRjYmZhY2Q4DQoNClRoZSBpc3N1ZSBpcyB0aGF0IHJldmFs
aWRhdGlvbiBtYXkgY2F1c2UgdGhlIGRlbnRyeSB0byBiZSBkcm9wcGVkIGluIE5GUyBpZiwgc2F5
LCB0aGUgY2xpZW50IG5vdGVzIHRoYXQgdGhlIGRpcmVjdG9yeSB0aW1lc3RhbXBzIGhhdmUgY2hh
bmdlZC4NCg0KUmVwb3J0ZWQtYnk6IExlYW5kcm8gQXdhIDxsYXdhQG52aWRpYS5jb20+DQpMaW5r
OiBodHRwczovL2J1Z3ppbGxhLmtlcm5lbC5vcmcvc2hvd19idWcuY2dpP2lkPTEwNDkxMQ0KRml4
ZXM6IDc2NmM0Y2JmYWNkOCAoIm5hbWVpOiBkX2lzX25lZ2F0aXZlKCkgc2hvdWxkIGJlIGNoZWNr
ZWQuLi4iKQ0KQ2M6IHN0YWJsZUB2Z2VyLmtlcm5lbC5vcmcgIyB2NC4xKw0KU2lnbmVkLW9mZi1i
eTogVHJvbmQgTXlrbGVidXN0IDx0cm9uZC5teWtsZWJ1c3RAcHJpbWFyeWRhdGEuY29tPg0KLS0t
DQogZnMvbmFtZWkuYyB8IDggKysrKysrLS0NCiAxIGZpbGUgY2hhbmdlZCwgNiBpbnNlcnRpb25z
KCspLCAyIGRlbGV0aW9ucygtKQ0KDQpkaWZmIC0tZ2l0IGEvZnMvbmFtZWkuYyBiL2ZzL25hbWVp
LmMNCmluZGV4IDcyNmQyMTFkYjQ4NC4uMzNlOTQ5NWEzMTI5IDEwMDY0NA0KLS0tIGEvZnMvbmFt
ZWkuYw0KKysrIGIvZnMvbmFtZWkuYw0KQEAgLTE1NTgsOCArMTU1OCw2IEBAIHN0YXRpYyBpbnQg
bG9va3VwX2Zhc3Qoc3RydWN0IG5hbWVpZGF0YSAqbmQsDQogCQluZWdhdGl2ZSA9IGRfaXNfbmVn
YXRpdmUoZGVudHJ5KTsNCiAJCWlmIChyZWFkX3NlcWNvdW50X3JldHJ5KCZkZW50cnktPmRfc2Vx
LCBzZXEpKQ0KIAkJCXJldHVybiAtRUNISUxEOw0KLQkJaWYgKG5lZ2F0aXZlKQ0KLQkJCXJldHVy
biAtRU5PRU5UOw0KIA0KIAkJLyoNCiAJCSAqIFRoaXMgc2VxdWVuY2UgY291bnQgdmFsaWRhdGVz
IHRoYXQgdGhlIHBhcmVudCBoYWQgbm8gQEAgLTE1ODAsNiArMTU3OCwxMiBAQCBzdGF0aWMgaW50
IGxvb2t1cF9mYXN0KHN0cnVjdCBuYW1laWRhdGEgKm5kLA0KIAkJCQlnb3RvIHVubGF6eTsNCiAJ
CQl9DQogCQl9DQorCQkvKg0KKwkJICogTm90ZTogZG8gbmVnYXRpdmUgZGVudHJ5IGNoZWNrIGFm
dGVyIHJldmFsaWRhdGlvbiBpbg0KKwkJICogY2FzZSB0aGF0IGRyb3BzIGl0Lg0KKwkJICovDQor
CQlpZiAobmVnYXRpdmUpDQorCQkJcmV0dXJuIC1FTk9FTlQ7DQogCQlwYXRoLT5tbnQgPSBtbnQ7
DQogCQlwYXRoLT5kZW50cnkgPSBkZW50cnk7DQogCQlpZiAobGlrZWx5KF9fZm9sbG93X21vdW50
X3JjdShuZCwgcGF0aCwgaW5vZGUsIHNlcXApKSkNCi0tDQoyLjQuMw0KDQotLQ0KVHJvbmQgTXlr
bGVidXN0DQpMaW51eCBORlMgY2xpZW50IG1haW50YWluZXIsIFByaW1hcnlEYXRhIHRyb25kLm15
a2xlYnVzdEBwcmltYXJ5ZGF0YS5jb20NCg0KDQoNCg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0NClRoaXMgZW1haWwgbWVzc2FnZSBpcyBmb3IgdGhlIHNvbGUgdXNlIG9mIHRoZSBpbnRlbmRl
ZCByZWNpcGllbnQocykgYW5kIG1heSBjb250YWluDQpjb25maWRlbnRpYWwgaW5mb3JtYXRpb24u
ICBBbnkgdW5hdXRob3JpemVkIHJldmlldywgdXNlLCBkaXNjbG9zdXJlIG9yIGRpc3RyaWJ1dGlv
bg0KaXMgcHJvaGliaXRlZC4gIElmIHlvdSBhcmUgbm90IHRoZSBpbnRlbmRlZCByZWNpcGllbnQs
IHBsZWFzZSBjb250YWN0IHRoZSBzZW5kZXIgYnkNCnJlcGx5IGVtYWlsIGFuZCBkZXN0cm95IGFs
bCBjb3BpZXMgb2YgdGhlIG9yaWdpbmFsIG1lc3NhZ2UuDQotLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLQ0K

2015-10-09 17:44:47

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

Leandro Awa writes:
After switching to version 4.1.6, our parallelized and distributed workflows now fail consistently with errors of the form:

T34: ./regex.c:39:22: error: config.h: No such file or directory

2015-10-10 00:19:03

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

On Fri, Oct 9, 2015 at 10:44 AM, Trond Myklebust
<[email protected]> wrote:
>
> The issue is that revalidation may cause the dentry to be dropped in NFS
> if, say, the client notes that the directory timestamps have changed.

Ack.

We've had this bug before, where we returned something else than
-ENOCHLD while we were doing RCU lookups. See for example commit
97242f99a013 ("link_path_walk(): be careful when failing with
ENOTDIR").

So in general, we should always (a) either verify all sequence points
or (b) return -ENOCHLD to go into slow mode. The patch seems

However, this thing was explicitly made to be this way by commit
766c4cbfacd8 ("namei: d_is_negative() should be checked before ->d_seq
validation"), so while my gut feel is to consider this fix
ObviouslyCorrect(tm), I will delay it a bit in the hope to get an ACK
and comment from Al about the patch.

Al?

Linus

2015-10-10 01:37:03

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

On Fri, Oct 09, 2015 at 05:19:02PM -0700, Linus Torvalds wrote:

> So in general, we should always (a) either verify all sequence points
> or (b) return -ENOCHLD to go into slow mode. The patch seems
>
> However, this thing was explicitly made to be this way by commit
> 766c4cbfacd8 ("namei: d_is_negative() should be checked before ->d_seq
> validation"), so while my gut feel is to consider this fix
> ObviouslyCorrect(tm), I will delay it a bit in the hope to get an ACK
> and comment from Al about the patch.
>
> Al?

Umm... I agree that the current version is wrong and it looks like this
patch is a complete fix. The only problem is the commit message -
what really happens is that 766c4cbfacd8 got the things subtly wrong.
We used to treat d_is_negative() after lookup_fast() as "fall with ENOENT".
That was wrong - checking ->d_flags outside of ->d_seq protection is
unreliable and failing with hard error on what should've fallen back to
non-RCU pathname resolution is a bug.

Unfortunately, we'd pulled the test too far up and ran afoul of another
kind of staleness. Dentry might have been absolutely stable from the
RCU point of view (and we might be on UP, etc.), but stale from the
remote fs point of view. If ->d_revalidate() returns "it's actually
stale", dentry gets thrown away and original code wouldn't even have looked
at its ->d_flags. What we need is to check ->d_flags where 766c4cbfacd8 does
(prior to ->d_seq validation) but only use the result in cases where we
do not discard this dentry outright.

With some explanation along the lines of the above added, consider the patch
ACKed.

2015-10-10 17:13:20

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

On Sat, Oct 10, 2015 at 02:36:57AM +0100, Al Viro wrote:
> On Fri, Oct 09, 2015 at 05:19:02PM -0700, Linus Torvalds wrote:
>
> > So in general, we should always (a) either verify all sequence points
> > or (b) return -ENOCHLD to go into slow mode. The patch seems
> >
> > However, this thing was explicitly made to be this way by commit
> > 766c4cbfacd8 ("namei: d_is_negative() should be checked before ->d_seq
> > validation"), so while my gut feel is to consider this fix
> > ObviouslyCorrect(tm), I will delay it a bit in the hope to get an ACK
> > and comment from Al about the patch.
> >
> > Al?
>
> Umm... I agree that the current version is wrong and it looks like this
> patch is a complete fix. The only problem is the commit message -
> what really happens is that 766c4cbfacd8 got the things subtly wrong.
> We used to treat d_is_negative() after lookup_fast() as "fall with ENOENT".
> That was wrong - checking ->d_flags outside of ->d_seq protection is
> unreliable and failing with hard error on what should've fallen back to
> non-RCU pathname resolution is a bug.
>
> Unfortunately, we'd pulled the test too far up and ran afoul of another
> kind of staleness. Dentry might have been absolutely stable from the
> RCU point of view (and we might be on UP, etc.), but stale from the
> remote fs point of view. If ->d_revalidate() returns "it's actually
> stale", dentry gets thrown away and original code wouldn't even have looked
> at its ->d_flags. What we need is to check ->d_flags where 766c4cbfacd8 does
> (prior to ->d_seq validation) but only use the result in cases where we
> do not discard this dentry outright.
>
> With some explanation along the lines of the above added, consider the patch
> ACKed.

OK, I've attemtped to add an explanation of what's going on; please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Trond Myklebust (1):
namei: results of d_is_negative() should be acted upon only after dentry revalidation

Diffstat:
fs/namei.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)


2015-10-10 17:19:34

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

On Sat, Oct 10, 2015 at 10:13 AM, Al Viro <[email protected]> wrote:
>
> OK, I've attemtped to add an explanation of what's going on; please, pull from
..

Heh. I just committed it myself with your emailed explanation, so ..

Linus