2019-03-06 00:44:24

by Christian Kujau

[permalink] [raw]
Subject: FS-Cache: Duplicate cookie detected

Hi,

ever since ec0328e46d6e ("fscache: Maintain a catalogue of allocated
cookies") was commited, people are seeing[0] those "Duplicate cookie
detected" messages in syslog, see below. NFS and CIFS mounts appear to
continue to work, but these messsages are new and I too am wondering if
this is something to worry about.

They _are_ logged with pr_err in fs/fscache/cookie.c, but maybe this needs
to be changed to a different loglevel?

Thanks,
Christian.

FS-Cache: Duplicate cookie detected
FS-Cache: O-cookie c=000000009da9dbf0 [p=000000001593f904 fl=222 nc=0 na=1]
FS-Cache: O-cookie d=00000000287febd9 n=00000000980c9e8a
FS-Cache: O-key=[8] '020001bdc0a80064'
FS-Cache: N-cookie c=00000000bfe3f869 [p=000000001593f904 fl=2 nc=0 na=1]
FS-Cache: N-cookie d=00000000287febd9 n=00000000e153f178
FS-Cache: N-key=[8] '020001bdc0a80064'


[0] https://bugzilla.kernel.org/show_bug.cgi?id=200145
--
BOFH excuse #318:

Your EMAIL is now being delivered by the USPS.


2019-03-06 17:02:45

by David Howells

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

Christian Kujau <[email protected]> wrote:

> FS-Cache: Duplicate cookie detected
> FS-Cache: O-cookie c=000000009da9dbf0 [p=000000001593f904 fl=222 nc=0 na=1]
> FS-Cache: O-cookie d=00000000287febd9 n=00000000980c9e8a
> FS-Cache: O-key=[8] '020001bdc0a80064'
> FS-Cache: N-cookie c=00000000bfe3f869 [p=000000001593f904 fl=2 nc=0 na=1]
> FS-Cache: N-cookie d=00000000287febd9 n=00000000e153f178
> FS-Cache: N-key=[8] '020001bdc0a80064'

I've just spotted something that all these things have in common. The old
cookie has FSCACHE_COOKIE_RELINQUISHING set in the flags (fl=222). I don't
have a wait for this situation in fscache - which I should.

Interestingly, '020001bdc0a80064' looks like is could be the key from the
server level, assuming this is from NFS:

0200 01bdc0a8 0064

could be:

AF_INET (LE), 1.189.192.168, 100 (BE)

David

2019-03-06 18:34:53

by David Howells

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

David Howells <[email protected]> wrote:

> > FS-Cache: Duplicate cookie detected
> > FS-Cache: O-cookie c=000000009da9dbf0 [p=000000001593f904 fl=222 nc=0 na=1]
> > FS-Cache: O-cookie d=00000000287febd9 n=00000000980c9e8a
> > FS-Cache: O-key=[8] '020001bdc0a80064'
> > FS-Cache: N-cookie c=00000000bfe3f869 [p=000000001593f904 fl=2 nc=0 na=1]
> > FS-Cache: N-cookie d=00000000287febd9 n=00000000e153f178
> > FS-Cache: N-key=[8] '020001bdc0a80064'
>
> I've just spotted something that all these things have in common. The old
> cookie has FSCACHE_COOKIE_RELINQUISHING set in the flags (fl=222). I don't
> have a wait for this situation in fscache - which I should.

I take it back - 0x200 is FSCACHE_COOKIE_ACQUIRED. :-/

I can reproduce a slightly different problem by setting off ~6000 parallel
processes, each reading its own individual directory of files.

I also see reports like this:

FS-Cache: Duplicate cookie detected
FS-Cache: O-cookie c=00000000db33ad59 [p=000000004bc53500 fl=218 nc=0 na=0]
FS-Cache: O-cookie d= (null) n= (null)
FS-Cache: O-cookie o=000000006cf6db4f
FS-Cache: O-key=[16] '0100010101000000e51fc6000323ae02'
FS-Cache: N-cookie c=00000000791c49d0 [p=000000004bc53500 fl=2 nc=0 na=1]
FS-Cache: N-cookie d=00000000e220fe14 n=00000000d4484489
FS-Cache: N-key=[16] '0100010101000000e51fc6000323ae02'

with no cookie def or netfs data and flags ACQUIRED, RELINQUISHED and
INVALIDATING - which I can insert a wait for.

David

2019-03-06 20:19:54

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

On Wed, 6 Mar 2019, David Howells wrote:
> I can reproduce a slightly different problem by setting off ~6000 parallel
> processes, each reading its own individual directory of files.

Ususually I only see it shortly after mount, and only once, but I too can
reproduce it with NFStest ([0], and there's a Fedora package too) via
"nfstest_cache --server $SERVER --client `hostname`", which then produces
a couple of these messages:

FS-Cache: Duplicate cookie detected
FS-Cache: O-cookie c=000000002fcc866b [p=00000000c10c6e18 fl=222 nc=0 na=1]
FS-Cache: O-cookie d=00000000d5ed73bb n=00000000076c9150
FS-Cache: O-key=[10] '040002000801c0a80073'
FS-Cache: N-cookie c=00000000e8d5dcd4 [p=00000000c10c6e18 fl=2 nc=0 na=1]
FS-Cache: N-cookie d=00000000d5ed73bb n=00000000a54e9705
FS-Cache: N-key=[10] '040002000801c0a80073'


...and the O-key does indeed seem to resemble the server address,
somewhat:

>>> s = "040002000801c0a80073";
>>> bytes = ["".join(x) for x in zip(*[iter(s)]*2)]; bytes = [int(x, 16) for x in bytes]; print ".".join(str(x) for x in reversed(bytes))
115.0.168.192.1.8.0.2.0.4
^^^^^^^^^^^^^

Mount options on that client are:

$ mount | grep nfs4
nfs:/usr/local/src on /usr/local/src type nfs4 (ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,local_lock=none,addr=192.168.0.115)

...so FS-Cache ("fsc") isn't even used here. The server in that scenario
is a current Fedora 29 installation.

HTH,
Christian.

[0] http://wiki.linux-nfs.org/wiki/index.php/NFStest

>
> I also see reports like this:
>
> FS-Cache: Duplicate cookie detected
> FS-Cache: O-cookie c=00000000db33ad59 [p=000000004bc53500 fl=218 nc=0 na=0]
> FS-Cache: O-cookie d= (null) n= (null)
> FS-Cache: O-cookie o=000000006cf6db4f
> FS-Cache: O-key=[16] '0100010101000000e51fc6000323ae02'
> FS-Cache: N-cookie c=00000000791c49d0 [p=000000004bc53500 fl=2 nc=0 na=1]
> FS-Cache: N-cookie d=00000000e220fe14 n=00000000d4484489
> FS-Cache: N-key=[16] '0100010101000000e51fc6000323ae02'
>
> with no cookie def or netfs data and flags ACQUIRED, RELINQUISHED and
> INVALIDATING - which I can insert a wait for.
>
> David

--
BOFH excuse #420:

Feature was not beta tested

2019-03-08 14:50:07

by David Howells

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

Christian Kujau <[email protected]> wrote:

> $ mount | grep nfs4
> nfs:/usr/local/src on /usr/local/src type nfs4 (ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,local_lock=none,addr=192.168.0.115)
>
> ...so FS-Cache ("fsc") isn't even used here.

Interesting. Can you do:

cat /proc/fs/nfsfs/volumes

See the attached for a patch that helps with certain kinds of collision,
though I can't see that it should help with what you're seeing since the
RELINQUISHED flag isn't set on the old cookie (fl=222, but 0x10 isn't in
there). You can monitor the number of waits by looking in
/proc/fs/fscache/stats for the:

Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748

line. See the wrq= value.

David
---
commit abe600bc6e5022e14c0ce135fa3d9ceaafa3cb46
Author: David Howells <[email protected]>
Date: Fri Mar 8 13:14:36 2019 +0000

fscache: Fix cookie collision

When adding a new cookie, a collision can sometimes occur because an old
cookie with the same key is still in the process of being asynchronously
relinquished, leading to something like the following being emitted into
the kernel log:

FS-Cache: Duplicate cookie detected
FS-Cache: O-cookie c=00000000db33ad59 [p=000000004bc53500 fl=218 nc=0 na=0]
FS-Cache: O-cookie d= (null) n= (null)
FS-Cache: O-cookie o=000000006cf6db4f
FS-Cache: O-key=[16] '0100010101000000e51fc6000323ae02'
FS-Cache: N-cookie c=00000000791c49d0 [p=000000004bc53500 fl=2 nc=0 na=1]
FS-Cache: N-cookie d=00000000e220fe14 n=00000000d4484489
FS-Cache: N-key=[16] '0100010101000000e51fc6000323ae02'

with the old cookie (O- lines) showing no cookie def or netfs data and
showing flags ACQUIRED, RELINQUISHED and INVALIDATING (fl=218).

Fix this by:

(1) Setting FSCACHE_COOKIE_RELINQUISHING on the cookie we're about to tear
dispose of.

(2) Making fscache_hash_cookie() sleep on RELINQUISHING if there's a
collision with a cookie that has RELINQUISHED set.

(3) Clearing RELINQUISHING upon having unhashed the cookie and waking up
anyone waiting for it to transition to unset.

Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
Signed-off-by: David Howells <[email protected]>

diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 50f0a5757f48..f304bda88bb1 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -231,6 +231,7 @@ proc files.
ok=N Number of acq reqs succeeded
nbf=N Number of acq reqs rejected due to error
oom=N Number of acq reqs failed on ENOMEM
+ wrq=N Number of waits for relinquishment of conflicting old cookies.
Lookups n=N Number of lookup calls made on cache backends
neg=N Number of negative lookups made
pos=N Number of positive lookups made
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index c550512ce335..75d0ffd36ac0 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -202,7 +202,9 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
struct hlist_bl_head *h;
struct hlist_bl_node *p;
unsigned int bucket;
+ int ret;

+retry:
bucket = candidate->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
h = &fscache_cookie_hash[bucket];

@@ -220,19 +222,37 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
return candidate;

collision:
- if (test_and_set_bit(FSCACHE_COOKIE_ACQUIRED, &cursor->flags)) {
- trace_fscache_cookie(cursor, fscache_cookie_collision,
- atomic_read(&cursor->usage));
- pr_err("Duplicate cookie detected\n");
- fscache_print_cookie(cursor, 'O');
- fscache_print_cookie(candidate, 'N');
- hlist_bl_unlock(h);
- return NULL;
- }
+ if (test_and_set_bit(FSCACHE_COOKIE_ACQUIRED, &cursor->flags))
+ goto duplicate;

fscache_cookie_get(cursor, fscache_cookie_get_reacquire);
hlist_bl_unlock(h);
return cursor;
+
+duplicate:
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHED, &cursor->flags))
+ goto wait_for_removal;
+
+ trace_fscache_cookie(cursor, fscache_cookie_collision,
+ atomic_read(&cursor->usage));
+ pr_err("Duplicate cookie detected\n");
+ fscache_print_cookie(cursor, 'O');
+ fscache_print_cookie(candidate, 'N');
+ hlist_bl_unlock(h);
+ return NULL;
+
+wait_for_removal:
+ fscache_cookie_get(cursor, fscache_cookie_get_wait);
+ hlist_bl_unlock(h);
+
+ fscache_stat(&fscache_n_acquires_wait_relinq);
+ ret = wait_on_bit(&cursor->flags, FSCACHE_COOKIE_RELINQUISHING,
+ TASK_INTERRUPTIBLE);
+
+ fscache_cookie_put(cursor, fscache_cookie_put_wait);
+ if (ret < 0)
+ return NULL;
+ goto retry;
}

/*
@@ -816,9 +836,16 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
/* No further netfs-accessing operations on this cookie permitted */
if (test_and_set_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags))
BUG();
+ set_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags);

__fscache_disable_cookie(cookie, aux_data, retire);

+ spin_lock(&cookie->lock);
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags) &&
+ hlist_empty(&cookie->backing_objects))
+ fscache_unhash_cookie(cookie);
+ spin_unlock(&cookie->lock);
+
/* Clear pointers back to the netfs */
cookie->netfs_data = NULL;
cookie->def = NULL;
@@ -839,19 +866,24 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
EXPORT_SYMBOL(__fscache_relinquish_cookie);

/*
- * Remove a cookie from the hash table.
+ * Remove a cookie from the hash table and wake up anyone waiting on this.
*/
-static void fscache_unhash_cookie(struct fscache_cookie *cookie)
+void fscache_unhash_cookie(struct fscache_cookie *cookie)
{
struct hlist_bl_head *h;
unsigned int bucket;

+ _enter("%p", cookie);
+
bucket = cookie->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
h = &fscache_cookie_hash[bucket];

hlist_bl_lock(h);
- hlist_bl_del(&cookie->hash_link);
+ hlist_bl_del_init(&cookie->hash_link);
hlist_bl_unlock(h);
+
+ if (test_and_clear_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags))
+ wake_up_bit(&cookie->flags, FSCACHE_COOKIE_RELINQUISHING);
}

/*
@@ -873,8 +905,8 @@ void fscache_cookie_put(struct fscache_cookie *cookie,
return;
BUG_ON(usage < 0);

+ ASSERT(hlist_bl_unhashed(&cookie->hash_link));
parent = cookie->parent;
- fscache_unhash_cookie(cookie);
fscache_free_cookie(cookie);

cookie = parent;
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index d6209022e965..caf0df77e1b1 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -57,6 +57,7 @@ extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
const void *, size_t,
void *, loff_t);
extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
+extern void fscache_unhash_cookie(struct fscache_cookie *);
extern void fscache_cookie_put(struct fscache_cookie *,
enum fscache_cookie_trace);

@@ -227,6 +228,7 @@ extern atomic_t fscache_n_acquires_no_cache;
extern atomic_t fscache_n_acquires_ok;
extern atomic_t fscache_n_acquires_nobufs;
extern atomic_t fscache_n_acquires_oom;
+extern atomic_t fscache_n_acquires_wait_relinq;

extern atomic_t fscache_n_invalidates;
extern atomic_t fscache_n_invalidates_run;
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 6d9cb1719de5..d16737e0307a 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -706,7 +706,7 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
struct fscache_object *parent = object->parent;
struct fscache_cookie *cookie = object->cookie;
struct fscache_cache *cache = object->cache;
- bool awaken = false;
+ int awaken = 0;

_enter("{OBJ%x,%d},%d", object->debug_id, object->n_children, event);

@@ -723,13 +723,18 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
*/
spin_lock(&cookie->lock);
hlist_del_init(&object->cookie_link);
- if (hlist_empty(&cookie->backing_objects) &&
- test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
- awaken = true;
+ if (hlist_empty(&cookie->backing_objects)) {
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags))
+ awaken = 2;
+ else if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
+ awaken = 1;
+ }
spin_unlock(&cookie->lock);

if (awaken)
wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+ if (awaken == 2)
+ fscache_unhash_cookie(cookie);
if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);

diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 00564a1dfd76..ca53e84cb7ea 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -80,6 +80,7 @@ atomic_t fscache_n_acquires_no_cache;
atomic_t fscache_n_acquires_ok;
atomic_t fscache_n_acquires_nobufs;
atomic_t fscache_n_acquires_oom;
+atomic_t fscache_n_acquires_wait_relinq;

atomic_t fscache_n_invalidates;
atomic_t fscache_n_invalidates_run;
@@ -163,13 +164,14 @@ int fscache_stats_show(struct seq_file *m, void *v)
atomic_read(&fscache_n_uncaches));

seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u"
- " oom=%u\n",
+ " oom=%u wrq=%u\n",
atomic_read(&fscache_n_acquires),
atomic_read(&fscache_n_acquires_null),
atomic_read(&fscache_n_acquires_no_cache),
atomic_read(&fscache_n_acquires_ok),
atomic_read(&fscache_n_acquires_nobufs),
- atomic_read(&fscache_n_acquires_oom));
+ atomic_read(&fscache_n_acquires_oom),
+ atomic_read(&fscache_n_acquires_wait_relinq));

seq_printf(m, "Lookups: n=%u neg=%u pos=%u crt=%u tmo=%u\n",
atomic_read(&fscache_n_object_lookups),
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 686cfe997ed2..d9a969642da7 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -30,11 +30,13 @@ enum fscache_cookie_trace {
fscache_cookie_get_attach_object,
fscache_cookie_get_reacquire,
fscache_cookie_get_register_netfs,
+ fscache_cookie_get_wait,
fscache_cookie_put_acquire_nobufs,
fscache_cookie_put_dup_netfs,
fscache_cookie_put_relinquish,
fscache_cookie_put_object,
fscache_cookie_put_parent,
+ fscache_cookie_put_wait,
};

enum fscache_page_trace {
@@ -96,11 +98,13 @@ enum fscache_page_op_trace {
EM(fscache_cookie_get_attach_object, "GET obj") \
EM(fscache_cookie_get_reacquire, "GET raq") \
EM(fscache_cookie_get_register_netfs, "GET net") \
+ EM(fscache_cookie_get_wait, "GET wai") \
EM(fscache_cookie_put_acquire_nobufs, "PUT nbf") \
EM(fscache_cookie_put_dup_netfs, "PUT dnt") \
EM(fscache_cookie_put_relinquish, "PUT rlq") \
EM(fscache_cookie_put_object, "PUT obj") \
- E_(fscache_cookie_put_parent, "PUT prn")
+ EM(fscache_cookie_put_parent, "PUT prn") \
+ E_(fscache_cookie_put_wait, "PUT wai")

#define fscache_page_traces \
EM(fscache_page_cached, "Cached ") \

2019-03-08 17:57:56

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

On Fri, 8 Mar 2019, David Howells wrote:
> > $ mount | grep nfs4
> > nfs:/usr/local/src on /usr/local/src type nfs4 (ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,local_lock=none,addr=192.168.0.115)
> >
> > ...so FS-Cache ("fsc") isn't even used here.
>
> Interesting. Can you do:
>
> cat /proc/fs/nfsfs/volumes

That seems to confirm the mount options, fsc is disabled:

# cat /proc/fs/nfsfs/volumes
NV SERVER PORT DEV FSID FSC
v4 c0a80073 801 0:46 1cfd45bf1921474d:a795870ea80f5ff7 no

> See the attached for a patch that helps with certain kinds of collision,
> though I can't see that it should help with what you're seeing since the
> RELINQUISHED flag isn't set on the old cookie (fl=222, but 0x10 isn't in
> there). You can monitor the number of waits by looking in
> /proc/fs/fscache/stats for the:
>
> Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748

Ah, the wrq= field gets only introduced by this patch. OK, I'll see if I
can build a test kernel with that and will report back.

Thanks for looking in to this,
Christian.
--
BOFH excuse #290:

The CPU has shifted, and become decentralized.

2019-03-08 23:16:11

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

On Fri, 8 Mar 2019, David Howells wrote:
> See the attached for a patch that helps with certain kinds of collision,
> though I can't see that it should help with what you're seeing since the
> RELINQUISHED flag isn't set on the old cookie (fl=222, but 0x10 isn't in
> there). You can monitor the number of waits by looking in
> /proc/fs/fscache/stats for the:
>
> Acquire: n=289166 nul=0 noc=0 ok=286331 nbf=2 oom=0 wrq=23748

Running Linux v5.0 with this patch applied does indeed still produce the
"Duplicate cookie detected" messages, but I only ever see wrq=0 when
running nfstest_cache:

https://paste.fedoraproject.org/paste/dkav0FQzYZxE9-V7GphjAQ

(Scroll down until the messages start to appear again)

Only the n= field seems to change during that test:


fedora0# grep wrq n2.log | sort | uniq -c | sort -n
28 Acquire: n=8 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0
29 Acquire: n=7 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0
34 Acquire: n=6 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0
82 Acquire: n=9 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0
93 Acquire: n=5 nul=0 noc=0 ok=1 nbf=0 oom=0 wrq=0

HTH,
Christian.
--
BOFH excuse #5:

static from plastic slide rules

2019-03-08 23:26:56

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

On Fri, 8 Mar 2019, Christian Kujau wrote:
> Running Linux v5.0 with this patch applied does indeed still produce the
> "Duplicate cookie detected" messages, but I only ever see wrq=0 when
> running nfstest_cache:
>
> https://paste.fedoraproject.org/paste/dkav0FQzYZxE9-V7GphjAQ

And again with the whole /proc/fs/fscache/stats output and better time
stamps: https://paste.fedoraproject.org/paste/hZtCPStJlqB1d9JXnTFndQ

C
--
BOFH excuse #5:

static from plastic slide rules

2019-03-11 14:50:46

by David Howells

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

Hi Christian,

I've a couple more patches for you - one a bugfix and one that will print more
information. They don't actually affect the problem you're seeing. I'll post
them as replies to this message.

David

2019-03-11 14:51:28

by David Howells

[permalink] [raw]
Subject: [PATCH 1/3] fscache: Fix cookie collision

When adding a new cookie, a collision can sometimes occur because an old
cookie with the same key is still in the process of being asynchronously
relinquished, leading to something like the following being emitted into
the kernel log:

FS-Cache: Duplicate cookie detected
FS-Cache: O-cookie c=00000000db33ad59 [p=000000004bc53500 fl=218 nc=0 na=0]
FS-Cache: O-cookie d= (null) n= (null)
FS-Cache: O-cookie o=000000006cf6db4f
FS-Cache: O-key=[16] '0100010101000000e51fc6000323ae02'
FS-Cache: N-cookie c=00000000791c49d0 [p=000000004bc53500 fl=2 nc=0 na=1]
FS-Cache: N-cookie d=00000000e220fe14 n=00000000d4484489
FS-Cache: N-key=[16] '0100010101000000e51fc6000323ae02'

with the old cookie (O- lines) showing no cookie def or netfs data and
showing flags ACQUIRED, RELINQUISHED and INVALIDATING (fl=218).

Fix this by:

(1) Setting FSCACHE_COOKIE_RELINQUISHING on the cookie we're about to tear
dispose of.

(2) Making fscache_hash_cookie() sleep on RELINQUISHING if there's a
collision with a cookie that has RELINQUISHED set.

(3) Clearing RELINQUISHING upon having unhashed the cookie and waking up
anyone waiting for it to transition to unset.

Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
Signed-off-by: David Howells <[email protected]>
---

Documentation/filesystems/caching/fscache.txt | 1
fs/fscache/cookie.c | 58 +++++++++++++++++++------
fs/fscache/internal.h | 2 +
fs/fscache/object.c | 13 ++++--
fs/fscache/stats.c | 6 ++-
include/trace/events/fscache.h | 6 ++-
6 files changed, 66 insertions(+), 20 deletions(-)

diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 50f0a5757f48..f304bda88bb1 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -231,6 +231,7 @@ proc files.
ok=N Number of acq reqs succeeded
nbf=N Number of acq reqs rejected due to error
oom=N Number of acq reqs failed on ENOMEM
+ wrq=N Number of waits for relinquishment of conflicting old cookies.
Lookups n=N Number of lookup calls made on cache backends
neg=N Number of negative lookups made
pos=N Number of positive lookups made
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index c550512ce335..75d0ffd36ac0 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -202,7 +202,9 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
struct hlist_bl_head *h;
struct hlist_bl_node *p;
unsigned int bucket;
+ int ret;

+retry:
bucket = candidate->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
h = &fscache_cookie_hash[bucket];

@@ -220,19 +222,37 @@ struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *candidate)
return candidate;

collision:
- if (test_and_set_bit(FSCACHE_COOKIE_ACQUIRED, &cursor->flags)) {
- trace_fscache_cookie(cursor, fscache_cookie_collision,
- atomic_read(&cursor->usage));
- pr_err("Duplicate cookie detected\n");
- fscache_print_cookie(cursor, 'O');
- fscache_print_cookie(candidate, 'N');
- hlist_bl_unlock(h);
- return NULL;
- }
+ if (test_and_set_bit(FSCACHE_COOKIE_ACQUIRED, &cursor->flags))
+ goto duplicate;

fscache_cookie_get(cursor, fscache_cookie_get_reacquire);
hlist_bl_unlock(h);
return cursor;
+
+duplicate:
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHED, &cursor->flags))
+ goto wait_for_removal;
+
+ trace_fscache_cookie(cursor, fscache_cookie_collision,
+ atomic_read(&cursor->usage));
+ pr_err("Duplicate cookie detected\n");
+ fscache_print_cookie(cursor, 'O');
+ fscache_print_cookie(candidate, 'N');
+ hlist_bl_unlock(h);
+ return NULL;
+
+wait_for_removal:
+ fscache_cookie_get(cursor, fscache_cookie_get_wait);
+ hlist_bl_unlock(h);
+
+ fscache_stat(&fscache_n_acquires_wait_relinq);
+ ret = wait_on_bit(&cursor->flags, FSCACHE_COOKIE_RELINQUISHING,
+ TASK_INTERRUPTIBLE);
+
+ fscache_cookie_put(cursor, fscache_cookie_put_wait);
+ if (ret < 0)
+ return NULL;
+ goto retry;
}

/*
@@ -816,9 +836,16 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
/* No further netfs-accessing operations on this cookie permitted */
if (test_and_set_bit(FSCACHE_COOKIE_RELINQUISHED, &cookie->flags))
BUG();
+ set_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags);

__fscache_disable_cookie(cookie, aux_data, retire);

+ spin_lock(&cookie->lock);
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags) &&
+ hlist_empty(&cookie->backing_objects))
+ fscache_unhash_cookie(cookie);
+ spin_unlock(&cookie->lock);
+
/* Clear pointers back to the netfs */
cookie->netfs_data = NULL;
cookie->def = NULL;
@@ -839,19 +866,24 @@ void __fscache_relinquish_cookie(struct fscache_cookie *cookie,
EXPORT_SYMBOL(__fscache_relinquish_cookie);

/*
- * Remove a cookie from the hash table.
+ * Remove a cookie from the hash table and wake up anyone waiting on this.
*/
-static void fscache_unhash_cookie(struct fscache_cookie *cookie)
+void fscache_unhash_cookie(struct fscache_cookie *cookie)
{
struct hlist_bl_head *h;
unsigned int bucket;

+ _enter("%p", cookie);
+
bucket = cookie->key_hash & (ARRAY_SIZE(fscache_cookie_hash) - 1);
h = &fscache_cookie_hash[bucket];

hlist_bl_lock(h);
- hlist_bl_del(&cookie->hash_link);
+ hlist_bl_del_init(&cookie->hash_link);
hlist_bl_unlock(h);
+
+ if (test_and_clear_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags))
+ wake_up_bit(&cookie->flags, FSCACHE_COOKIE_RELINQUISHING);
}

/*
@@ -873,8 +905,8 @@ void fscache_cookie_put(struct fscache_cookie *cookie,
return;
BUG_ON(usage < 0);

+ ASSERT(hlist_bl_unhashed(&cookie->hash_link));
parent = cookie->parent;
- fscache_unhash_cookie(cookie);
fscache_free_cookie(cookie);

cookie = parent;
diff --git a/fs/fscache/internal.h b/fs/fscache/internal.h
index d6209022e965..caf0df77e1b1 100644
--- a/fs/fscache/internal.h
+++ b/fs/fscache/internal.h
@@ -57,6 +57,7 @@ extern struct fscache_cookie *fscache_alloc_cookie(struct fscache_cookie *,
const void *, size_t,
void *, loff_t);
extern struct fscache_cookie *fscache_hash_cookie(struct fscache_cookie *);
+extern void fscache_unhash_cookie(struct fscache_cookie *);
extern void fscache_cookie_put(struct fscache_cookie *,
enum fscache_cookie_trace);

@@ -227,6 +228,7 @@ extern atomic_t fscache_n_acquires_no_cache;
extern atomic_t fscache_n_acquires_ok;
extern atomic_t fscache_n_acquires_nobufs;
extern atomic_t fscache_n_acquires_oom;
+extern atomic_t fscache_n_acquires_wait_relinq;

extern atomic_t fscache_n_invalidates;
extern atomic_t fscache_n_invalidates_run;
diff --git a/fs/fscache/object.c b/fs/fscache/object.c
index 6d9cb1719de5..d16737e0307a 100644
--- a/fs/fscache/object.c
+++ b/fs/fscache/object.c
@@ -706,7 +706,7 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
struct fscache_object *parent = object->parent;
struct fscache_cookie *cookie = object->cookie;
struct fscache_cache *cache = object->cache;
- bool awaken = false;
+ int awaken = 0;

_enter("{OBJ%x,%d},%d", object->debug_id, object->n_children, event);

@@ -723,13 +723,18 @@ static const struct fscache_state *fscache_drop_object(struct fscache_object *ob
*/
spin_lock(&cookie->lock);
hlist_del_init(&object->cookie_link);
- if (hlist_empty(&cookie->backing_objects) &&
- test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
- awaken = true;
+ if (hlist_empty(&cookie->backing_objects)) {
+ if (test_bit(FSCACHE_COOKIE_RELINQUISHING, &cookie->flags))
+ awaken = 2;
+ else if (test_and_clear_bit(FSCACHE_COOKIE_INVALIDATING, &cookie->flags))
+ awaken = 1;
+ }
spin_unlock(&cookie->lock);

if (awaken)
wake_up_bit(&cookie->flags, FSCACHE_COOKIE_INVALIDATING);
+ if (awaken == 2)
+ fscache_unhash_cookie(cookie);
if (test_and_clear_bit(FSCACHE_COOKIE_LOOKING_UP, &cookie->flags))
wake_up_bit(&cookie->flags, FSCACHE_COOKIE_LOOKING_UP);

diff --git a/fs/fscache/stats.c b/fs/fscache/stats.c
index 00564a1dfd76..ca53e84cb7ea 100644
--- a/fs/fscache/stats.c
+++ b/fs/fscache/stats.c
@@ -80,6 +80,7 @@ atomic_t fscache_n_acquires_no_cache;
atomic_t fscache_n_acquires_ok;
atomic_t fscache_n_acquires_nobufs;
atomic_t fscache_n_acquires_oom;
+atomic_t fscache_n_acquires_wait_relinq;

atomic_t fscache_n_invalidates;
atomic_t fscache_n_invalidates_run;
@@ -163,13 +164,14 @@ int fscache_stats_show(struct seq_file *m, void *v)
atomic_read(&fscache_n_uncaches));

seq_printf(m, "Acquire: n=%u nul=%u noc=%u ok=%u nbf=%u"
- " oom=%u\n",
+ " oom=%u wrq=%u\n",
atomic_read(&fscache_n_acquires),
atomic_read(&fscache_n_acquires_null),
atomic_read(&fscache_n_acquires_no_cache),
atomic_read(&fscache_n_acquires_ok),
atomic_read(&fscache_n_acquires_nobufs),
- atomic_read(&fscache_n_acquires_oom));
+ atomic_read(&fscache_n_acquires_oom),
+ atomic_read(&fscache_n_acquires_wait_relinq));

seq_printf(m, "Lookups: n=%u neg=%u pos=%u crt=%u tmo=%u\n",
atomic_read(&fscache_n_object_lookups),
diff --git a/include/trace/events/fscache.h b/include/trace/events/fscache.h
index 686cfe997ed2..d9a969642da7 100644
--- a/include/trace/events/fscache.h
+++ b/include/trace/events/fscache.h
@@ -30,11 +30,13 @@ enum fscache_cookie_trace {
fscache_cookie_get_attach_object,
fscache_cookie_get_reacquire,
fscache_cookie_get_register_netfs,
+ fscache_cookie_get_wait,
fscache_cookie_put_acquire_nobufs,
fscache_cookie_put_dup_netfs,
fscache_cookie_put_relinquish,
fscache_cookie_put_object,
fscache_cookie_put_parent,
+ fscache_cookie_put_wait,
};

enum fscache_page_trace {
@@ -96,11 +98,13 @@ enum fscache_page_op_trace {
EM(fscache_cookie_get_attach_object, "GET obj") \
EM(fscache_cookie_get_reacquire, "GET raq") \
EM(fscache_cookie_get_register_netfs, "GET net") \
+ EM(fscache_cookie_get_wait, "GET wai") \
EM(fscache_cookie_put_acquire_nobufs, "PUT nbf") \
EM(fscache_cookie_put_dup_netfs, "PUT dnt") \
EM(fscache_cookie_put_relinquish, "PUT rlq") \
EM(fscache_cookie_put_object, "PUT obj") \
- E_(fscache_cookie_put_parent, "PUT prn")
+ EM(fscache_cookie_put_parent, "PUT prn") \
+ E_(fscache_cookie_put_wait, "PUT wai")

#define fscache_page_traces \
EM(fscache_page_cached, "Cached ") \


2019-03-11 14:51:41

by David Howells

[permalink] [raw]
Subject: [PATCH 2/3] fscache: Add missing unhash

In __fscache_acquire_cookie() if fscache_acquire_non_index_cookie() fails,
we clean up and return NULL (indicating no cookie) - however, we don't
unhash the previously hashed object first.

Fix this by moving the failure code out of the main path through the
function to its own goto label and add the missing fscache_unhash_cookie()
call.

Moving the code out to its own label allows the remaining part of the
if-statement to be simplified also.

Fixes: ec0328e46d6e ("fscache: Maintain a catalogue of allocated cookies")
Signed-off-by: David Howells <[email protected]>
---

fs/fscache/cookie.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 75d0ffd36ac0..6931e45b7d59 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -346,20 +346,10 @@ struct fscache_cookie *__fscache_acquire_cookie(
/* if the object is an index then we need do nothing more here
* - we create indices on disk when we need them as an index
* may exist in multiple caches */
- if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX) {
- if (fscache_acquire_non_index_cookie(cookie, object_size) == 0) {
- set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
- } else {
- atomic_dec(&parent->n_children);
- fscache_cookie_put(cookie,
- fscache_cookie_put_acquire_nobufs);
- fscache_stat(&fscache_n_acquires_nobufs);
- _leave(" = NULL");
- return NULL;
- }
- } else {
- set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
- }
+ if (cookie->type != FSCACHE_COOKIE_TYPE_INDEX &&
+ fscache_acquire_non_index_cookie(cookie, object_size) < 0)
+ goto failed;
+ set_bit(FSCACHE_COOKIE_ENABLED, &cookie->flags);
}

fscache_stat(&fscache_n_acquires_ok);
@@ -367,6 +357,14 @@ struct fscache_cookie *__fscache_acquire_cookie(
out:
fscache_free_cookie(candidate);
return cookie;
+
+failed:
+ atomic_dec(&parent->n_children);
+ fscache_unhash_cookie(cookie);
+ fscache_cookie_put(cookie, fscache_cookie_put_acquire_nobufs);
+ fscache_stat(&fscache_n_acquires_nobufs);
+ _leave(" = NULL");
+ return NULL;
}
EXPORT_SYMBOL(__fscache_acquire_cookie);



2019-03-11 14:53:41

by David Howells

[permalink] [raw]
Subject: [PATCH 3/3] fscache: Print cookie debugging information


---

fs/afs/cache.c | 11 +++++++++++
fs/fscache/cookie.c | 7 +++++--
fs/nfs/fscache-index.c | 11 +++++++++++
include/linux/fscache.h | 2 ++
4 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/fs/afs/cache.c b/fs/afs/cache.c
index f6d0a21e8052..c16b10dcf895 100644
--- a/fs/afs/cache.c
+++ b/fs/afs/cache.c
@@ -32,10 +32,21 @@ struct fscache_cookie_def afs_volume_cache_index_def = {
.type = FSCACHE_COOKIE_TYPE_INDEX,
};

+static void afs_vnode_print_cookie(void *cookie_netfs_data, char prefix)
+{
+ struct afs_vnode *vnode = cookie_netfs_data;
+
+ pr_err("kAFS : %c-vnode %llx:%llx:%x\n",
+ prefix, vnode->fid.vid, vnode->fid.vnode, vnode->fid.unique);
+ pr_err("kAFS : %c-vol v=%p %llx\n",
+ prefix, vnode->volume, vnode->volume->vid);
+}
+
struct fscache_cookie_def afs_vnode_cache_index_def = {
.name = "AFS.vnode",
.type = FSCACHE_COOKIE_TYPE_DATAFILE,
.check_aux = afs_vnode_cache_check_aux,
+ .print = afs_vnode_print_cookie,
};

/*
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 6931e45b7d59..50db256e6593 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -41,8 +41,9 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
prefix, cookie, cookie->parent, cookie->flags,
atomic_read(&cookie->n_children),
atomic_read(&cookie->n_active));
- pr_err("%c-cookie d=%p n=%p\n",
- prefix, cookie->def, cookie->netfs_data);
+ pr_err("%c-cookie d=%p n=%p [%s]\n",
+ prefix, cookie->def, cookie->netfs_data,
+ cookie->def ? cookie->def->name : "");

object = READ_ONCE(cookie->backing_objects.first);
if (object)
@@ -55,6 +56,8 @@ static void fscache_print_cookie(struct fscache_cookie *cookie, char prefix)
for (loop = 0; loop < cookie->key_len; loop++)
pr_cont("%02x", k[loop]);
pr_cont("'\n");
+ if (cookie->def && cookie->def->print)
+ cookie->def->print(cookie->netfs_data, prefix);
}

void fscache_free_cookie(struct fscache_cookie *cookie)
diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
index 666415d13d52..bba6753cdfb7 100644
--- a/fs/nfs/fscache-index.c
+++ b/fs/nfs/fscache-index.c
@@ -124,6 +124,16 @@ static void nfs_fh_put_context(void *cookie_netfs_data, void *context)
put_nfs_open_context(context);
}

+static void nfs_inode_print_cookie(void *cookie_netfs_data, char prefix)
+{
+ struct nfs_inode *nfsi = cookie_netfs_data;
+ struct inode *i = &nfsi->vfs_inode;
+
+ pr_err("NFS : %c-nfs fl=%lx cv=%lx ii=%lx if=%x sb=%p sf=%lx\n",
+ prefix, nfsi->flags, nfsi->cache_validity,
+ i->i_ino, i->i_flags, i->i_sb, i->i_sb->s_flags);
+}
+
/*
* Define the inode object for FS-Cache. This is used to describe an inode
* object to fscache_acquire_cookie(). It is keyed by the NFS file handle for
@@ -139,4 +149,5 @@ const struct fscache_cookie_def nfs_fscache_inode_object_def = {
.check_aux = nfs_fscache_inode_check_aux,
.get_context = nfs_fh_get_context,
.put_context = nfs_fh_put_context,
+ .print = nfs_inode_print_cookie,
};
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index 84b90a79d75a..84525722522e 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -114,6 +114,8 @@ struct fscache_cookie_def {
void (*mark_page_cached)(void *cookie_netfs_data,
struct address_space *mapping,
struct page *page);
+
+ void (*print)(void *cookie_netfs_data, char prefix);
};

/*


2019-03-12 06:08:15

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

On Mon, 11 Mar 2019, David Howells wrote:
> I've a couple more patches for you - one a bugfix and one that will print more
> information. They don't actually affect the problem you're seeing. I'll post
> them as replies to this message.

Thanks for the patches. I've applied all three to v5.0 and ran
"nfstest_cache" and was able to reproduce the messages. Please note that
I'm only running "nfstest_cache" because it's somehow able to reproduce
the message reliably - otherwise the message just shows up once or twice
in syslog, but I didn't know how to reproduce it.

But I noticed something else this time, and I did not notice that before:
while running nfstest_cache, the "duplicate cookie" messages were only
triggered when my other, non-test mount was also mounted during the test.
Let me describe my F29 test VM again:

* VM boots, and /usr/local/src gets mounted via NFS, read-only, and
with w/o fsc options. cachefilesd isn't even installed here.

* I run nfstest_cache and apparently it's mounting the same NFS export
from the server to /mnt/t, as a readonly mount.

So two mounts, one in /usr/local/src, the other in /mnt/t, both readonly
and both w/o "fsc", but the "duplicate cookie" message is only printed
when /usr/local/src was mounted. If /usr/local/src wasn't mounted, the
test would complete[0] and no "duplicate message was printed. And then I
noticed:

----------------------------------------------------------
$ mount | tail -2 | fold
horus:/usr/local/src on /usr/local/src type nfs4
(ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)

horus:/ on /mnt/t type nfs4
(rw,relatime,vers=4.1,rsize=4096,wsize=4096,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)
----------------------------------------------------------

My /usr/local/src mount was mounted with vers=4.2 (default), while
nfstest_cache was mounting its test-mount with vers=4.1! Apart from the
different rsize/wsize values, the version number stood out. And indeed,
when I mount my regular NFS mount /usr/local/src with vers=4.1, the
"duplicate cookie" is no longer printed.

For simplicity, I've attached two logs to this email:

* nfs_no-mount.txt.xz - showing /proc/fs/nfsfs/volumes and
/proc/fs/fscache/stats every 0.01 seconds, while running nfstest_cache
in another terminal. Note that no duplicate "cookie messages" were
triggered, as /usr/local/src was not mounted.

* nfs_with-mount.txt.xz - same, but here /usr/local/src was mounted (and
defaulted to vers=4.2), and thus "duplicate cookie" messages were
printed.


I fear that all this may complicate this strange behaviour, and now we're
examining NFS mount versions, but I only noticed that now, not earlier :-\

I can't comment on the patches much, as you mentioned they won't make the
message go away, but I hope it printed more details now.

Thanks,
Christian.


[0] Again, I'm using nfstest_cache only to trigger the message. Everytime
I execute it, the test fails, because I think it expects a rw-mount:

$ nfstest_cache --server horus --client fedora0 --runtest=acregmin_attr
*** Verify consistency of attribute caching with NFSv4.1 on a file
acregmin = 10
TEST: Running test 'acregmin_attr'
FAIL: Traceback (most recent call last):
File "/usr/bin/nfstest_cache", line 199, in do_file_test
fdw = open(self.absfile, "w")
IOError: [Errno 30] Read-only file system:
'/mnt/t/nfstest_cache_20190311223404_f_1'
TIME: 4.497078s
1 tests (0 passed, 1 failed)
Total time: 5.529826s

--
BOFH excuse #209:

Only people with names beginning with 'A' are getting mail this week (a la Microsoft)


Attachments:
nfs_with-mount.txt.xz (5.16 kB)
nfs_no-mount.txt.xz (4.71 kB)
Download all attachments

2019-03-12 07:02:37

by David Howells

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

Christian Kujau <[email protected]> wrote:

> My /usr/local/src mount was mounted with vers=4.2 (default), while
> nfstest_cache was mounting its test-mount with vers=4.1! Apart from the
> different rsize/wsize values, the version number stood out. And indeed,
> when I mount my regular NFS mount /usr/local/src with vers=4.1, the
> "duplicate cookie" is no longer printed.

Yeah - NFS superblocks are differentiated by a whole host of parameters,
including protocol version number, and caches aren't shared between
superblocks because this introduces a tricky coherency problem.

The issue is that NFS superblocks to the same place do not currently manage
coherency (inode attributes, data) between themselves, except via the server.

However, if "fsc" isn't given on the mount commandline, the superblock
probably shouldn't get a server-level cookie if we can avoid it.

David

2019-03-31 09:06:27

by Christian Kujau

[permalink] [raw]
Subject: Re: FS-Cache: Duplicate cookie detected

Hi David,

On Tue, 12 Mar 2019, David Howells wrote:
> > My /usr/local/src mount was mounted with vers=4.2 (default), while
> > nfstest_cache was mounting its test-mount with vers=4.1! Apart from the
> > different rsize/wsize values, the version number stood out. And indeed,
> > when I mount my regular NFS mount /usr/local/src with vers=4.1, the
> > "duplicate cookie" is no longer printed.
>
> Yeah - NFS superblocks are differentiated by a whole host of parameters,
> including protocol version number, and caches aren't shared between
> superblocks because this introduces a tricky coherency problem.
>
> The issue is that NFS superblocks to the same place do not currently manage
> coherency (inode attributes, data) between themselves, except via the server.
>
> However, if "fsc" isn't given on the mount commandline, the superblock
> probably shouldn't get a server-level cookie if we can avoid it.

Just checking - are you waiting for new results from me, should I test
something that I missed? Or are new patches in the works? :-D

Thanks,
Christian.
--
BOFH excuse #139:

UBNC (user brain not connected)

2019-08-06 18:52:04

by Guilherme Piccoli

[permalink] [raw]
Subject: Re: [PATCH 1/3] fscache: Fix cookie collision

Hi David, was there any respin for this patch? I couldn't find it upstream.

This message shows a lot in the xfstests against cifs.
Thanks in advance,


Guilherme