2014-08-15 14:45:23

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 0/7] nfs-utils: support for lifting grace period early

This patchset adds some support to sm-notify and nfsdcltrack for lifting
the grace periods early. Allowing this actually work depends on the
companion kernel patchset, but the approach I've taken here should deal
properly with userland/kernel mismatch.

There are two main pieces:

sm-notify: in the event that sm-notify isn't sending any NOTIFY
requests, we don't expect to see any reclaims from clients. In that
case, we should be able to safely lift the lockd grace period early.
The first patch in the series implements this (though we'll probably
need a bit of selinux work to get that working in Fedora under enforcing
mode).

nfsdcltrack: if there are no v4.0 clients and all v4.1+ clients have
issued a RECLAIM_COMPLETE, then we can go ahead and end the nfsd grace
period. The remainder of the patchset adds the support for this. This
requires revving the DB schema for it, and making use of the environment
variables that are passed to the upcall by the kernel.

Again, this is just an RFC set for now. Does anyone have thoughts or
comments on the general approach?

Jeff Layton (7):
sm-notify: inform the kernel if there were no hosts to notify
nfsdcltrack: update comments in sqlite.c
nfsdcltrack: rename CLD_* constants with CLTRACK_* prefixes
nfsdcltrack: overhaul database initializtion
nfsdcltrack: update schema to v2
nfsdcltrack: grab the client minorversion from the env var if it's
present
nfsdcltrack: fetch NFSDCLTRACK_GRACE_START out of environment

utils/nfsdcltrack/nfsdcltrack.c | 106 +++++++++++-
utils/nfsdcltrack/sqlite.c | 375 ++++++++++++++++++++++++++++++----------
utils/nfsdcltrack/sqlite.h | 5 +-
utils/statd/sm-notify.c | 25 +++
4 files changed, 413 insertions(+), 98 deletions(-)

--
1.9.3



2014-08-18 20:59:34

by Jeff Layton

[permalink] [raw]
Subject: Re: [nfs-utils RFC PATCH 0/7] nfs-utils: support for lifting grace period early

On Mon, 18 Aug 2014 16:04:56 -0400
"J. Bruce Fields" <[email protected]> wrote:

> Thanks for working on this, it's currently a major annoyance!
>

Thanks for looking at it. I think I might go with a different scheme
for managing the v4.x grace lifting which won't involve new procfiles,
but I need to code it up and test it out. Stay tuned for v2, but I'd
appreciate any feedback that you have on this set.

> On Fri, Aug 15, 2014 at 10:45:08AM -0400, Jeff Layton wrote:
> > This patchset adds some support to sm-notify and nfsdcltrack for lifting
> > the grace periods early. Allowing this actually work depends on the
> > companion kernel patchset, but the approach I've taken here should deal
> > properly with userland/kernel mismatch.
> >
> > There are two main pieces:
> >
> > sm-notify: in the event that sm-notify isn't sending any NOTIFY
> > requests, we don't expect to see any reclaims from clients. In that
> > case, we should be able to safely lift the lockd grace period early.
> > The first patch in the series implements this (though we'll probably
> > need a bit of selinux work to get that working in Fedora under enforcing
> > mode).
> >
> > nfsdcltrack: if there are no v4.0 clients and all v4.1+ clients have
> > issued a RECLAIM_COMPLETE, then we can go ahead and end the nfsd grace
> > period. The remainder of the patchset adds the support for this. This
> > requires revving the DB schema for it, and making use of the environment
> > variables that are passed to the upcall by the kernel.
>
> And we get to skip the grace period on first start, and on any restart
> during which no clients hold state, which seem like big improvements on
> their own.
>

Agreed.

> Mostly a pointless digression, but I've been wondering if there are any
> further improvements that are worth it. I'm feeling skeptical:
>
> As long as nobody uses deny modes we could deal with them with a big
> dumb hammer (e.g. synchronously record something if anyone requests one,
> insist on a full grace period if they do).
>
> But we're still stuck blocking all new opens as long as there are
> delegations outstanding. (Well, maybe only new read opens, until we get
> write delegations.)
>

I think you mean "write opens" there, but yes.

> Would it work to record a count of delegations on shutdown, to speed up
> the restart-after-clean-shutdown case? And if it worked, would it be
> worth the effort? (Alternatively on shutdown we could stop giving out
> delegations, recall everything, and wait till we had no delegations till
> shutting down completely. Wonder how painful that would be.)
>
> So anyway maybe our best bet is just to hope people upgrade to 4.1 soon.
>

Those all seem a bit tricky to get right. Slowing down reboots will
suck. From the client standpoint that seems somewhat equivalent to a
grace period delay. The client is stuck either way.

Counting and recording delegations is an interesting idea, but you'd
have to take care not to record anything until the grace period has
ended.

That said, I personally don't have a lot of interest in speeding up
v4.0 reclaim. I'm not opposed to any of the above schemes, but I think
we'd be better off just pushing people toward v4.1 if they want the
grace period to be shorter.

Given the annoyance of the 90s grace period, that might act as a carrot
for encouraging people to switch.

On that same note, we probably need to discuss switching the default in
mount.nfs to v4.1 sometime soon. Maybe a topic for discussion at the
fall BaT?

--
Jeff Layton <[email protected]>

2014-08-15 14:45:24

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 1/7] sm-notify: inform the kernel if there were no hosts to notify

In the event that there no hosts to be notified after a reboot, there's
no real reason to force lockd to wait the entire grace period before
handing out locks. We're not expecting any reclaim requests to come in
that situation.

Have sm-notify do a write to /proc/fs/lockd/nlm_end_grace if that file
is present. That informs the kernel that it's OK to go ahead and lift
lockd's grace period early.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/statd/sm-notify.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/utils/statd/sm-notify.c b/utils/statd/sm-notify.c
index 9dbe5d908336..828a6991f8e6 100644
--- a/utils/statd/sm-notify.c
+++ b/utils/statd/sm-notify.c
@@ -42,6 +42,8 @@
#define NSM_TIMEOUT 2
#define NSM_MAX_TIMEOUT 120 /* don't make this too big */

+#define NLM_END_GRACE_FILE "/proc/fs/lockd/nlm_end_grace"
+
struct nsm_host {
struct nsm_host * next;
char * name;
@@ -450,6 +452,28 @@ retry:
return sock;
}

+/* Inform the kernel that it's OK to lift lockd's grace period */
+static void
+nsm_lift_grace_period(void)
+{
+ int fd;
+
+ fd = open(NLM_END_GRACE_FILE, O_WRONLY);
+ if (fd < 0) {
+ /* Don't warn if file isn't present */
+ if (errno != ENOENT)
+ xlog(L_WARNING, "Unable to open %s: %m",
+ NLM_END_GRACE_FILE);
+ return;
+ }
+
+ if (write(fd, "Y", 1) < 0)
+ xlog(L_WARNING, "Unable to write to %s: %m", NLM_END_GRACE_FILE);
+
+ close(fd);
+ return;
+}
+
int
main(int argc, char **argv)
{
@@ -534,6 +558,7 @@ usage: fprintf(stderr,
(void)nsm_retire_monitored_hosts();
if (nsm_load_notify_list(smn_get_host) == 0) {
xlog(D_GENERAL, "No hosts to notify; exiting");
+ nsm_lift_grace_period();
return 0;
}

--
1.9.3


2014-08-18 20:04:56

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [nfs-utils RFC PATCH 0/7] nfs-utils: support for lifting grace period early

Thanks for working on this, it's currently a major annoyance!

On Fri, Aug 15, 2014 at 10:45:08AM -0400, Jeff Layton wrote:
> This patchset adds some support to sm-notify and nfsdcltrack for lifting
> the grace periods early. Allowing this actually work depends on the
> companion kernel patchset, but the approach I've taken here should deal
> properly with userland/kernel mismatch.
>
> There are two main pieces:
>
> sm-notify: in the event that sm-notify isn't sending any NOTIFY
> requests, we don't expect to see any reclaims from clients. In that
> case, we should be able to safely lift the lockd grace period early.
> The first patch in the series implements this (though we'll probably
> need a bit of selinux work to get that working in Fedora under enforcing
> mode).
>
> nfsdcltrack: if there are no v4.0 clients and all v4.1+ clients have
> issued a RECLAIM_COMPLETE, then we can go ahead and end the nfsd grace
> period. The remainder of the patchset adds the support for this. This
> requires revving the DB schema for it, and making use of the environment
> variables that are passed to the upcall by the kernel.

And we get to skip the grace period on first start, and on any restart
during which no clients hold state, which seem like big improvements on
their own.

Mostly a pointless digression, but I've been wondering if there are any
further improvements that are worth it. I'm feeling skeptical:

As long as nobody uses deny modes we could deal with them with a big
dumb hammer (e.g. synchronously record something if anyone requests one,
insist on a full grace period if they do).

But we're still stuck blocking all new opens as long as there are
delegations outstanding. (Well, maybe only new read opens, until we get
write delegations.)

Would it work to record a count of delegations on shutdown, to speed up
the restart-after-clean-shutdown case? And if it worked, would it be
worth the effort? (Alternatively on shutdown we could stop giving out
delegations, recall everything, and wait till we had no delegations till
shutting down completely. Wonder how painful that would be.)

So anyway maybe our best bet is just to hope people upgrade to 4.1 soon.

--b.

>
> Again, this is just an RFC set for now. Does anyone have thoughts or
> comments on the general approach?
>
> Jeff Layton (7):
> sm-notify: inform the kernel if there were no hosts to notify
> nfsdcltrack: update comments in sqlite.c
> nfsdcltrack: rename CLD_* constants with CLTRACK_* prefixes
> nfsdcltrack: overhaul database initializtion
> nfsdcltrack: update schema to v2
> nfsdcltrack: grab the client minorversion from the env var if it's
> present
> nfsdcltrack: fetch NFSDCLTRACK_GRACE_START out of environment
>
> utils/nfsdcltrack/nfsdcltrack.c | 106 +++++++++++-
> utils/nfsdcltrack/sqlite.c | 375 ++++++++++++++++++++++++++++++----------
> utils/nfsdcltrack/sqlite.h | 5 +-
> utils/statd/sm-notify.c | 25 +++
> 4 files changed, 413 insertions(+), 98 deletions(-)
>
> --
> 1.9.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2014-08-15 14:45:30

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 5/7] nfsdcltrack: update schema to v2

From: Jeff Layton <[email protected]>

In order to allow knfsd's lock manager to lift its grace period early,
we need to figure out whether all clients have finished reclaiming
their state not. Unfortunately, the current code doesn't allow us to
ascertain this. All we track for each client is a timestamp that tells
us when the last "check" or "create" operation came in.

We need to track the two timestamps separately. Add a new
"reclaim_complete" column to the database that tells us when the last
"create" operation came in. For now, we just insert "0" in that column
but a later patch will make it so that we insert a real timestamp for
v4.1+ client records.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/sqlite.c | 102 +++++++++++++++++++++++++++++++++++++++++----
1 file changed, 94 insertions(+), 8 deletions(-)

diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index dde2f7ff8621..e260e81b1722 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -29,7 +29,10 @@
*
* clients: an "id" column containing a BLOB with the long-form clientid as
* sent by the client, a "time" column containing a timestamp (in
- * epoch seconds) of when the record was last updated.
+ * epoch seconds) of when the record was last updated, and a
+ * "reclaim_complete" column containing a timestamp (in epoch seconds)
+ * of when the last "create" operation came in for v4.1+ clients.
+ * v4.0 clients should always have this set to 0.
*/

#ifdef HAVE_CONFIG_H
@@ -50,7 +53,7 @@

#include "xlog.h"

-#define CLTRACK_SQLITE_LATEST_SCHEMA_VERSION 1
+#define CLTRACK_SQLITE_LATEST_SCHEMA_VERSION 2

/* in milliseconds */
#define CLTRACK_SQLITE_BUSY_TIMEOUT 10000
@@ -120,6 +123,81 @@ out:
return ret;
}

+static int
+sqlite_maindb_update_v1_to_v2(void)
+{
+ int ret, ret2;
+ char *err;
+
+ /* begin transaction */
+ ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL,
+ &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to begin transaction: %s", err);
+ goto rollback;
+ }
+
+ /*
+ * Check schema version again. This time, under an exclusive
+ * transaction to guard against racing DB setup attempts
+ */
+ ret = sqlite_query_schema_version();
+ switch (ret) {
+ case 1:
+ /* Still at v1 -- do conversion */
+ break;
+ case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION:
+ /* Someone else raced in and set it up */
+ ret = 0;
+ goto rollback;
+ default:
+ /* Something went wrong -- fail! */
+ ret = -EINVAL;
+ goto rollback;
+ }
+
+ /* create v2 clients table */
+ ret = sqlite3_exec(dbh, "ALTER TABLE clients ADD COLUMN "
+ "reclaim_complete INTEGER;",
+ NULL, NULL, &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to update clients table: %s", err);
+ goto rollback;
+ }
+
+ ret = snprintf(buf, sizeof(buf), "UPDATE parameters SET value = %d "
+ "WHERE key = \"version\";",
+ CLTRACK_SQLITE_LATEST_SCHEMA_VERSION);
+ if (ret < 0) {
+ xlog(L_ERROR, "sprintf failed!");
+ goto rollback;
+ } else if ((size_t)ret >= sizeof(buf)) {
+ xlog(L_ERROR, "sprintf output too long! (%d chars)", ret);
+ ret = -EINVAL;
+ goto rollback;
+ }
+
+ ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to update schema version: %s", err);
+ goto rollback;
+ }
+
+ ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to commit transaction: %s", err);
+ goto rollback;
+ }
+out:
+ sqlite3_free(err);
+ return ret;
+rollback:
+ ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err);
+ if (ret2 != SQLITE_OK)
+ xlog(L_ERROR, "Unable to rollback transaction: %s", err);
+ goto out;
+}
+
/*
* Start an exclusive transaction and recheck the DB schema version. If it's
* still zero (indicating a new database) then set it up. If that all works,
@@ -127,9 +205,9 @@ out:
* transaction. On any error, rollback the transaction.
*/
int
-sqlite_maindb_init_v1(void)
+sqlite_maindb_init_v2(void)
{
- int ret;
+ int ret, ret2;
char *err = NULL;

/* Start a transaction */
@@ -169,7 +247,7 @@ sqlite_maindb_init_v1(void)

/* create the "clients" table */
ret = sqlite3_exec(dbh, "CREATE TABLE clients (id BLOB PRIMARY KEY, "
- "time INTEGER);",
+ "time INTEGER, reclaim_complete INTEGER);",
NULL, NULL, &err);
if (ret != SQLITE_OK) {
xlog(L_ERROR, "Unable to create clients table: %s", err);
@@ -207,7 +285,9 @@ out:

rollback:
/* Attempt to rollback the transaction */
- sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err);
+ ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err);
+ if (ret2 != SQLITE_OK)
+ xlog(L_ERROR, "Unable to rollback transaction: %s", err);
goto out;
}

@@ -255,9 +335,15 @@ sqlite_prepare_dbh(const char *topdir)
/* DB is already set up. Do nothing */
ret = 0;
break;
+ case 1:
+ /* Old DB -- update to new schema */
+ ret = sqlite_maindb_update_v1_to_v2();
+ if (ret)
+ goto out_close;
+ break;
case 0:
/* Query failed -- try to set up new DB */
- ret = sqlite_maindb_init_v1();
+ ret = sqlite_maindb_init_v2();
if (ret)
goto out_close;
break;
@@ -289,7 +375,7 @@ sqlite_insert_client(const unsigned char *clname, const size_t namelen)
sqlite3_stmt *stmt = NULL;

ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients VALUES "
- "(?, strftime('%s', 'now'));", -1,
+ "(?, strftime('%s', 'now'), 0);", -1,
&stmt, NULL);
if (ret != SQLITE_OK) {
xlog(L_ERROR, "%s: insert statement prepare failed: %s",
--
1.9.3


2014-08-19 14:49:33

by Jeff Layton

[permalink] [raw]
Subject: Re: [nfs-utils RFC PATCH 0/7] nfs-utils: support for lifting grace period early

On Mon, 18 Aug 2014 16:59:26 -0400
Jeff Layton <[email protected]> wrote:

> On Mon, 18 Aug 2014 16:04:56 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > Thanks for working on this, it's currently a major annoyance!
> >
>
> Thanks for looking at it. I think I might go with a different scheme
> for managing the v4.x grace lifting which won't involve new procfiles,
> but I need to code it up and test it out. Stay tuned for v2, but I'd
> appreciate any feedback that you have on this set.
>

Hmm, now that I've looked, I'm not sure the other scheme is really
feasible. What I wanted to do was to have nfsdcltrack return a distinct
error code that would indicate that all hosts have finished reclaiming.
That would avoid the need for the v4_end_grace file.

The problem there is that the range of return codes that you get back
via the UMH interface is pretty limited. Given the fact that I wasn't
terribly careful about sanitizing the exit status in earlier versions
of nfsdcltrack (mea culpa), we might should just stick with something
along the lines of what I've proposed here.

I'll plan to go with this unless there anyone has a better idea...

> > On Fri, Aug 15, 2014 at 10:45:08AM -0400, Jeff Layton wrote:
> > > This patchset adds some support to sm-notify and nfsdcltrack for lifting
> > > the grace periods early. Allowing this actually work depends on the
> > > companion kernel patchset, but the approach I've taken here should deal
> > > properly with userland/kernel mismatch.
> > >
> > > There are two main pieces:
> > >
> > > sm-notify: in the event that sm-notify isn't sending any NOTIFY
> > > requests, we don't expect to see any reclaims from clients. In that
> > > case, we should be able to safely lift the lockd grace period early.
> > > The first patch in the series implements this (though we'll probably
> > > need a bit of selinux work to get that working in Fedora under enforcing
> > > mode).
> > >
> > > nfsdcltrack: if there are no v4.0 clients and all v4.1+ clients have
> > > issued a RECLAIM_COMPLETE, then we can go ahead and end the nfsd grace
> > > period. The remainder of the patchset adds the support for this. This
> > > requires revving the DB schema for it, and making use of the environment
> > > variables that are passed to the upcall by the kernel.
> >
> > And we get to skip the grace period on first start, and on any restart
> > during which no clients hold state, which seem like big improvements on
> > their own.
> >
>
> Agreed.
>
> > Mostly a pointless digression, but I've been wondering if there are any
> > further improvements that are worth it. I'm feeling skeptical:
> >
> > As long as nobody uses deny modes we could deal with them with a big
> > dumb hammer (e.g. synchronously record something if anyone requests one,
> > insist on a full grace period if they do).
> >
> > But we're still stuck blocking all new opens as long as there are
> > delegations outstanding. (Well, maybe only new read opens, until we get
> > write delegations.)
> >
>
> I think you mean "write opens" there, but yes.
>
> > Would it work to record a count of delegations on shutdown, to speed up
> > the restart-after-clean-shutdown case? And if it worked, would it be
> > worth the effort? (Alternatively on shutdown we could stop giving out
> > delegations, recall everything, and wait till we had no delegations till
> > shutting down completely. Wonder how painful that would be.)
> >
> > So anyway maybe our best bet is just to hope people upgrade to 4.1 soon.
> >
>
> Those all seem a bit tricky to get right. Slowing down reboots will
> suck. From the client standpoint that seems somewhat equivalent to a
> grace period delay. The client is stuck either way.
>
> Counting and recording delegations is an interesting idea, but you'd
> have to take care not to record anything until the grace period has
> ended.
>
> That said, I personally don't have a lot of interest in speeding up
> v4.0 reclaim. I'm not opposed to any of the above schemes, but I think
> we'd be better off just pushing people toward v4.1 if they want the
> grace period to be shorter.
>
> Given the annoyance of the 90s grace period, that might act as a carrot
> for encouraging people to switch.
>
> On that same note, we probably need to discuss switching the default in
> mount.nfs to v4.1 sometime soon. Maybe a topic for discussion at the
> fall BaT?
>


--
Jeff Layton <[email protected]>

2014-08-15 14:45:27

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 3/7] nfsdcltrack: rename CLD_* constants with CLTRACK_* prefixes

From: Jeff Layton <[email protected]>

Since nfsdcld has been dead for a few years now, clean up the prefixes
on the constants.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/sqlite.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index 352f029d55c0..52fcaa4d7ca3 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -50,10 +50,10 @@

#include "xlog.h"

-#define CLD_SQLITE_SCHEMA_VERSION 1
+#define CLTRACK_SQLITE_LATEST_SCHEMA_VERSION 1

/* in milliseconds */
-#define CLD_SQLITE_BUSY_TIMEOUT 10000
+#define CLTRACK_SQLITE_BUSY_TIMEOUT 10000

/* private data structures */

@@ -111,7 +111,7 @@ sqlite_prepare_dbh(const char *topdir)
return ret;
}

- ret = sqlite3_busy_timeout(dbh, CLD_SQLITE_BUSY_TIMEOUT);
+ ret = sqlite3_busy_timeout(dbh, CLTRACK_SQLITE_BUSY_TIMEOUT);
if (ret != SQLITE_OK) {
xlog(L_ERROR, "Unable to set sqlite busy timeout: %d", ret);
sqlite3_close(dbh);
@@ -156,7 +156,7 @@ sqlite_maindb_init(const char *topdir)
/* insert version into table -- ignore error if it fails */
ret = snprintf(buf, sizeof(buf),
"INSERT OR IGNORE INTO parameters values (\"version\", "
- "\"%d\");", CLD_SQLITE_SCHEMA_VERSION);
+ "\"%d\");", CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION);
if (ret < 0) {
goto out_err;
} else if ((size_t)ret >= sizeof(buf)) {
@@ -189,10 +189,10 @@ sqlite_maindb_init(const char *topdir)

/* process SELECT result */
ret = sqlite3_column_int(stmt, 0);
- if (ret != CLD_SQLITE_SCHEMA_VERSION) {
+ if (ret != CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION) {
xlog(L_ERROR, "Unsupported database schema version! "
"Expected %d, got %d.",
- CLD_SQLITE_SCHEMA_VERSION, ret);
+ CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION, ret);
ret = -EINVAL;
goto out_err;
}
--
1.9.3


2014-08-15 14:45:31

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 6/7] nfsdcltrack: grab the client minorversion from the env var if it's present

And set the reclaim_complete field in the DB based on whether it's zero
or not. We have no way to know for certain when a v4.0 client has
completed reclaim, so we always set that field in the DB to zero for
v4.0 clients.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/nfsdcltrack.c | 29 +++++++++++++++++++++++++++--
utils/nfsdcltrack/sqlite.c | 16 ++++++++++++----
utils/nfsdcltrack/sqlite.h | 3 ++-
3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/utils/nfsdcltrack/nfsdcltrack.c b/utils/nfsdcltrack/nfsdcltrack.c
index f2813610be73..cbc98c7edc11 100644
--- a/utils/nfsdcltrack/nfsdcltrack.c
+++ b/utils/nfsdcltrack/nfsdcltrack.c
@@ -37,6 +37,7 @@
#include <libgen.h>
#include <sys/inotify.h>
#include <dirent.h>
+#include <limits.h>
#ifdef HAVE_SYS_CAPABILITY_H
#include <sys/prctl.h>
#include <sys/capability.h>
@@ -254,6 +255,30 @@ cltrack_init(const char __attribute__((unused)) *unused)
return ret;
}

+/*
+ * Fetch the contents of the NFSDCLTRACK_CLIENT_MINORVERSION env var. If
+ * it's not set or there is an error converting it to an unsigned int
+ * then return 0 (since the minorversion isn't reliable at that point).
+ */
+static unsigned int
+cltrack_get_minorvers(void)
+{
+ unsigned long minorvers;
+ char *end;
+ char *minorvers_str = getenv("NFSDCLTRACK_CLIENT_MINORVERSION");
+
+ if (!minorvers_str)
+ return 0;
+
+ errno = 0;
+ minorvers = strtoul(minorvers_str, &end, 0);
+ /* Problem converting or value is too large? */
+ if (errno || minorvers > UINT_MAX)
+ return 0;
+
+ return (unsigned int)minorvers;
+}
+
static int
cltrack_create(const char *id)
{
@@ -270,7 +295,7 @@ cltrack_create(const char *id)
if (len < 0)
return (int)len;

- ret = sqlite_insert_client(blob, len);
+ ret = sqlite_insert_client(blob, len, cltrack_get_minorvers());

return ret ? -EREMOTEIO : ret;
}
@@ -323,7 +348,7 @@ cltrack_check_legacy(const unsigned char *blob, const ssize_t len)
}

/* Dir exists, try to insert record into db */
- ret = sqlite_insert_client(blob, len);
+ ret = sqlite_insert_client(blob, len, 0);
if (ret) {
xlog(D_GENERAL, "Failed to insert client: %d", ret);
return -EREMOTEIO;
diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index e260e81b1722..01084bb6c4d8 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -369,14 +369,22 @@ out_close:
* Returns a non-zero sqlite error code, or SQLITE_OK (aka 0)
*/
int
-sqlite_insert_client(const unsigned char *clname, const size_t namelen)
+sqlite_insert_client(const unsigned char *clname, const size_t namelen,
+ const unsigned int minorvers)
{
int ret;
sqlite3_stmt *stmt = NULL;

- ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients VALUES "
- "(?, strftime('%s', 'now'), 0);", -1,
- &stmt, NULL);
+ if (minorvers == 0)
+ ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients "
+ "VALUES (?, strftime('%s', 'now'), 0);", -1,
+ &stmt, NULL);
+ else
+ ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients "
+ "VALUES (?, strftime('%s', 'now'), "
+ "strftime('%s', 'now'));", -1,
+ &stmt, NULL);
+
if (ret != SQLITE_OK) {
xlog(L_ERROR, "%s: insert statement prepare failed: %s",
__func__, sqlite3_errmsg(dbh));
diff --git a/utils/nfsdcltrack/sqlite.h b/utils/nfsdcltrack/sqlite.h
index 3713bfb66e2f..e9cc84cbb294 100644
--- a/utils/nfsdcltrack/sqlite.h
+++ b/utils/nfsdcltrack/sqlite.h
@@ -21,7 +21,8 @@
#define _SQLITE_H_

int sqlite_prepare_dbh(const char *topdir);
-int sqlite_insert_client(const unsigned char *clname, const size_t namelen);
+int sqlite_insert_client(const unsigned char *clname, const size_t namelen,
+ const unsigned int minorvers);
int sqlite_remove_client(const unsigned char *clname, const size_t namelen);
int sqlite_check_client(const unsigned char *clname, const size_t namelen);
int sqlite_remove_unreclaimed(const time_t grace_start);
--
1.9.3


2014-08-15 14:45:28

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 4/7] nfsdcltrack: overhaul database initializtion

We have some possibility for races with nfsdcltrack when the DB schema
is upgraded.

Suppose we update the nfs-utils package on a machine after the DB has
been initialized. With the current scheme of initializing the DB only
during the "init" phase, we could end up with a new program that expects
a new schema with an old database.

We could try to do a one-time update when the package is installed, but
that could be racy. We could get an upcall between when the program is
installed and when we run the update. Also, relying on packaging to get
that right is tricky at best. To fix this, change how the database
initialization and checking of the schema revision works.

On every upcall, attempt to open the db as we normally would. If that
fails, then try to create the directory if it doesn't exist and then
retry the open. If it fails again, then give up.

If we get a successful open, then query the DB for the schema version.
If it matches what we expect, then declare success and move on. If the
query fails then assume that the DB isn't set up yet. Start an exclusive
transaction, check the schema version again and then set up the DB if no
one raced in to create it in the meantime.

This should only add a tiny bit of overhead on most upcalls (just an
extra select of the parameters table), and should improve the
performance of the "init" upcall. It'll also make it possible to handle
DB schema changes sanely.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/nfsdcltrack.c | 2 +-
utils/nfsdcltrack/sqlite.c | 225 +++++++++++++++++++++++++---------------
utils/nfsdcltrack/sqlite.h | 1 -
3 files changed, 142 insertions(+), 86 deletions(-)

diff --git a/utils/nfsdcltrack/nfsdcltrack.c b/utils/nfsdcltrack/nfsdcltrack.c
index 4334340197b1..f2813610be73 100644
--- a/utils/nfsdcltrack/nfsdcltrack.c
+++ b/utils/nfsdcltrack/nfsdcltrack.c
@@ -241,7 +241,7 @@ cltrack_init(const char __attribute__((unused)) *unused)
}

/* set up storage db */
- ret = sqlite_maindb_init(storagedir);
+ ret = sqlite_prepare_dbh(storagedir);
if (ret) {
xlog(L_ERROR, "Failed to init database: %d", ret);
/*
diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index 52fcaa4d7ca3..dde2f7ff8621 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -88,135 +88,192 @@ mkdir_if_not_exist(const char *dirname)
return ret;
}

-/* Open the database and set up the database handle for it */
-int
-sqlite_prepare_dbh(const char *topdir)
+static int
+sqlite_query_schema_version(void)
{
int ret;
+ sqlite3_stmt *stmt = NULL;

- /* Do nothing if the database handle is already set up */
- if (dbh)
- return 0;
-
- ret = snprintf(buf, PATH_MAX - 1, "%s/main.sqlite", topdir);
- if (ret < 0)
- return ret;
-
- buf[PATH_MAX - 1] = '\0';
-
- ret = sqlite3_open(buf, &dbh);
+ /* prepare select query */
+ ret = sqlite3_prepare_v2(dbh,
+ "SELECT value FROM parameters WHERE key == \"version\";",
+ -1, &stmt, NULL);
if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to open main database: %d", ret);
- dbh = NULL;
- return ret;
+ xlog(L_ERROR, "Unable to prepare select statement: %s",
+ sqlite3_errstr(ret));
+ ret = 0;
+ goto out;
}

- ret = sqlite3_busy_timeout(dbh, CLTRACK_SQLITE_BUSY_TIMEOUT);
- if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to set sqlite busy timeout: %d", ret);
- sqlite3_close(dbh);
- dbh = NULL;
+ /* query schema version */
+ ret = sqlite3_step(stmt);
+ if (ret != SQLITE_ROW) {
+ xlog(L_ERROR, "Select statement execution failed: %s",
+ sqlite3_errstr(ret));
+ ret = 0;
+ goto out;
}

+ ret = sqlite3_column_int(stmt, 0);
+out:
+ sqlite3_finalize(stmt);
return ret;
}

/*
- * Open the "main" database, and attempt to initialize it by creating the
- * parameters table and inserting the schema version into it. Ignore any errors
- * from that, and then attempt to select the version out of it again. If the
- * version appears wrong, then assume that the DB is corrupt or has been
- * upgraded, and return an error. If all of that works, then attempt to create
- * the "clients" table.
+ * Start an exclusive transaction and recheck the DB schema version. If it's
+ * still zero (indicating a new database) then set it up. If that all works,
+ * then insert schema version into the parameters table and commit the
+ * transaction. On any error, rollback the transaction.
*/
int
-sqlite_maindb_init(const char *topdir)
+sqlite_maindb_init_v1(void)
{
int ret;
char *err = NULL;
- sqlite3_stmt *stmt = NULL;

- ret = mkdir_if_not_exist(topdir);
- if (ret)
+ /* Start a transaction */
+ ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL,
+ &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to begin transaction: %s", err);
return ret;
+ }

- ret = sqlite_prepare_dbh(topdir);
- if (ret)
- return ret;
+ /*
+ * Check schema version again. This time, under an exclusive
+ * transaction to guard against racing DB setup attempts
+ */
+ ret = sqlite_query_schema_version();
+ switch (ret) {
+ case 0:
+ /* Query failed again -- set up DB */
+ break;
+ case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION:
+ /* Someone else raced in and set it up */
+ ret = 0;
+ goto rollback;
+ default:
+ /* Something went wrong -- fail! */
+ ret = -EINVAL;
+ goto rollback;
+ }

- /* Try to create table */
- ret = sqlite3_exec(dbh, "CREATE TABLE IF NOT EXISTS parameters "
+ ret = sqlite3_exec(dbh, "CREATE TABLE parameters "
"(key TEXT PRIMARY KEY, value TEXT);",
NULL, NULL, &err);
if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to create parameter table: %d", ret);
- goto out_err;
+ xlog(L_ERROR, "Unable to create parameter table: %s", err);
+ goto rollback;
}

- /* insert version into table -- ignore error if it fails */
- ret = snprintf(buf, sizeof(buf),
- "INSERT OR IGNORE INTO parameters values (\"version\", "
- "\"%d\");", CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION);
+ /* create the "clients" table */
+ ret = sqlite3_exec(dbh, "CREATE TABLE clients (id BLOB PRIMARY KEY, "
+ "time INTEGER);",
+ NULL, NULL, &err);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to create clients table: %s", err);
+ goto rollback;
+ }
+
+
+ /* insert version into parameters table */
+ ret = snprintf(buf, sizeof(buf), "INSERT OR FAIL INTO parameters "
+ "values (\"version\", \"%d\");",
+ CLTRACK_SQLITE_LATEST_SCHEMA_VERSION);
if (ret < 0) {
- goto out_err;
+ xlog(L_ERROR, "sprintf failed!");
+ goto rollback;
} else if ((size_t)ret >= sizeof(buf)) {
+ xlog(L_ERROR, "sprintf output too long! (%d chars)", ret);
ret = -EINVAL;
- goto out_err;
+ goto rollback;
}

ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err);
if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to insert into parameter table: %d",
- ret);
- goto out_err;
+ xlog(L_ERROR, "Unable to insert into parameter table: %s", err);
+ goto rollback;
}

- ret = sqlite3_prepare_v2(dbh,
- "SELECT value FROM parameters WHERE key == \"version\";",
- -1, &stmt, NULL);
+ ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err);
if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to prepare select statement: %d", ret);
- goto out_err;
+ xlog(L_ERROR, "Unable to commit transaction: %s", err);
+ goto rollback;
}
+out:
+ sqlite3_free(err);
+ return ret;

- /* check schema version */
- ret = sqlite3_step(stmt);
- if (ret != SQLITE_ROW) {
- xlog(L_ERROR, "Select statement execution failed: %s",
- sqlite3_errmsg(dbh));
- goto out_err;
- }
+rollback:
+ /* Attempt to rollback the transaction */
+ sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err);
+ goto out;
+}

- /* process SELECT result */
- ret = sqlite3_column_int(stmt, 0);
- if (ret != CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION) {
- xlog(L_ERROR, "Unsupported database schema version! "
- "Expected %d, got %d.",
- CLTRACK_SQLITE_CURRENT_SCHEMA_VERSION, ret);
- ret = -EINVAL;
- goto out_err;
- }
+/* Open the database and set up the database handle for it */
+int
+sqlite_prepare_dbh(const char *topdir)
+{
+ int ret;

- /* now create the "clients" table */
- ret = sqlite3_exec(dbh, "CREATE TABLE IF NOT EXISTS clients "
- "(id BLOB PRIMARY KEY, time INTEGER);",
- NULL, NULL, &err);
+ /* Do nothing if the database handle is already set up */
+ if (dbh)
+ return 0;
+
+ ret = snprintf(buf, PATH_MAX - 1, "%s/main.sqlite", topdir);
+ if (ret < 0)
+ return ret;
+
+ buf[PATH_MAX - 1] = '\0';
+
+ /* open a new DB handle */
+ ret = sqlite3_open(buf, &dbh);
if (ret != SQLITE_OK) {
- xlog(L_ERROR, "Unable to create clients table: %s", err);
- goto out_err;
+ /* try to create the dir */
+ ret = mkdir_if_not_exist(topdir);
+ if (ret)
+ goto out_close;
+
+ /* retry open */
+ ret = sqlite3_open(buf, &dbh);
+ if (ret != SQLITE_OK)
+ goto out_close;
}

- sqlite3_free(err);
- sqlite3_finalize(stmt);
- return 0;
+ /* set busy timeout */
+ ret = sqlite3_busy_timeout(dbh, CLTRACK_SQLITE_BUSY_TIMEOUT);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "Unable to set sqlite busy timeout: %s",
+ sqlite3_errstr(ret));
+ goto out_close;
+ }

-out_err:
- if (err) {
- xlog(L_ERROR, "sqlite error: %s", err);
- sqlite3_free(err);
+ ret = sqlite_query_schema_version();
+ switch (ret) {
+ case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION:
+ /* DB is already set up. Do nothing */
+ ret = 0;
+ break;
+ case 0:
+ /* Query failed -- try to set up new DB */
+ ret = sqlite_maindb_init_v1();
+ if (ret)
+ goto out_close;
+ break;
+ default:
+ /* Unknown DB version -- downgrade? Fail */
+ xlog(L_ERROR, "Unsupported database schema version! "
+ "Expected %d, got %d.",
+ CLTRACK_SQLITE_LATEST_SCHEMA_VERSION, ret);
+ ret = -EINVAL;
+ goto out_close;
}
- sqlite3_finalize(stmt);
- sqlite3_close(dbh);
+
+ return ret;
+out_close:
+ sqlite3_close_v2(dbh);
+ dbh = NULL;
return ret;
}

diff --git a/utils/nfsdcltrack/sqlite.h b/utils/nfsdcltrack/sqlite.h
index ebf04c3cdbaa..3713bfb66e2f 100644
--- a/utils/nfsdcltrack/sqlite.h
+++ b/utils/nfsdcltrack/sqlite.h
@@ -21,7 +21,6 @@
#define _SQLITE_H_

int sqlite_prepare_dbh(const char *topdir);
-int sqlite_maindb_init(const char *topdir);
int sqlite_insert_client(const unsigned char *clname, const size_t namelen);
int sqlite_remove_client(const unsigned char *clname, const size_t namelen);
int sqlite_check_client(const unsigned char *clname, const size_t namelen);
--
1.9.3


2014-08-15 14:45:25

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 2/7] nfsdcltrack: update comments in sqlite.c

Clean up and fix some inaccuracies.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/sqlite.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index bac678980938..352f029d55c0 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -21,17 +21,15 @@
* Explanation:
*
* This file contains the code to manage the sqlite backend database for the
- * clstated upcall daemon.
+ * nfsdcltrack usermodehelper upcall program.
*
* The main database is called main.sqlite and contains the following tables:
*
* parameters: simple key/value pairs for storing database info
*
- * clients: one column containing a BLOB with the as sent by the client
- * and a timestamp (in epoch seconds) of when the record was
- * established
- *
- * FIXME: should we also record the fsid being accessed?
+ * clients: an "id" column containing a BLOB with the long-form clientid as
+ * sent by the client, a "time" column containing a timestamp (in
+ * epoch seconds) of when the record was last updated.
*/

#ifdef HAVE_CONFIG_H
--
1.9.3


2014-08-15 14:45:33

by Jeff Layton

[permalink] [raw]
Subject: [nfs-utils RFC PATCH 7/7] nfsdcltrack: fetch NFSDCLTRACK_GRACE_START out of environment

Allow the fetching of NFSDCLTRACK_GRACE_START out of environment
variables. If it's present in the "create" or "init" upcalls, then we
can use that to query the database to see whether there are any clients
that have not issued a RECLAIM_COMPLETE since that time.

If there aren't any, then we know that all reclaim activity is now done
and we can then cue the kernel to lift the grace period.

Signed-off-by: Jeff Layton <[email protected]>
---
utils/nfsdcltrack/nfsdcltrack.c | 77 ++++++++++++++++++++++++++++++++++++++++-
utils/nfsdcltrack/sqlite.c | 40 +++++++++++++++++++++
utils/nfsdcltrack/sqlite.h | 1 +
3 files changed, 117 insertions(+), 1 deletion(-)

diff --git a/utils/nfsdcltrack/nfsdcltrack.c b/utils/nfsdcltrack/nfsdcltrack.c
index cbc98c7edc11..6a6b90a7c1a3 100644
--- a/utils/nfsdcltrack/nfsdcltrack.c
+++ b/utils/nfsdcltrack/nfsdcltrack.c
@@ -50,6 +50,8 @@
#define CLD_DEFAULT_STORAGEDIR NFS_STATEDIR "/nfsdcltrack"
#endif

+#define NFSD_END_GRACE_FILE "/proc/fs/nfsd/v4_end_grace"
+
/* defined by RFC 3530 */
#define NFS4_OPAQUE_LIMIT 1024

@@ -211,6 +213,64 @@ cltrack_set_caps(void)
return ret;
}

+/* Inform the kernel that it's OK to lift lockd's grace period */
+static void
+cltrack_lift_grace_period(void)
+{
+ int fd;
+
+ fd = open(NFSD_END_GRACE_FILE, O_WRONLY);
+ if (fd < 0) {
+ /* Don't warn if file isn't present */
+ if (errno != ENOENT)
+ xlog(L_WARNING, "Unable to open %s: %m",
+ NFSD_END_GRACE_FILE);
+ return;
+ }
+
+ if (write(fd, "Y", 1) < 0)
+ xlog(L_WARNING, "Unable to write to %s: %m",
+ NFSD_END_GRACE_FILE);
+
+ close(fd);
+ return;
+}
+
+/*
+ * Fetch the contents of the NFSDCLTRACK_GRACE_START env var. If it's not set
+ * or there's an error converting it to time_t, then return LONG_MAX.
+ */
+static time_t
+cltrack_get_grace_start(void)
+{
+ time_t grace_start;
+ char *end;
+ char *grace_start_str = getenv("NFSDCLTRACK_GRACE_START");
+
+ if (!grace_start_str)
+ return LONG_MAX;
+
+ errno = 0;
+ grace_start = strtol(grace_start_str, &end, 0);
+ /* Problem converting or value is too large? */
+ if (errno)
+ return LONG_MAX;
+
+ return grace_start;
+}
+
+static bool
+cltrack_reclaims_complete(void)
+{
+ time_t grace_start = cltrack_get_grace_start();
+
+ /* Don't query DB if we didn't get a valid boot time */
+ if (grace_start == LONG_MAX)
+ return false;
+
+ return !sqlite_query_reclaiming(grace_start);
+}
+
static int
cltrack_init(const char __attribute__((unused)) *unused)
{
@@ -251,7 +311,11 @@ cltrack_init(const char __attribute__((unused)) *unused)
* stop upcalling until the problem is resolved.
*/
ret = -EACCES;
+ } else {
+ if (cltrack_reclaims_complete())
+ cltrack_lift_grace_period();
}
+
return ret;
}

@@ -284,6 +348,7 @@ cltrack_create(const char *id)
{
int ret;
ssize_t len;
+ unsigned int minorvers;

xlog(D_GENERAL, "%s: create client record.", __func__);

@@ -295,8 +360,18 @@ cltrack_create(const char *id)
if (len < 0)
return (int)len;

- ret = sqlite_insert_client(blob, len, cltrack_get_minorvers());
+ minorvers = cltrack_get_minorvers();
+ ret = sqlite_insert_client(blob, len, minorvers);
+ if (ret)
+ goto out;
+
+ /* No point in querying the DB if this client was v4.0 */
+ if (minorvers == 0)
+ goto out;

+ if (cltrack_reclaims_complete())
+ cltrack_lift_grace_period();
+out:
return ret ? -EREMOTEIO : ret;
}

diff --git a/utils/nfsdcltrack/sqlite.c b/utils/nfsdcltrack/sqlite.c
index 01084bb6c4d8..17437e48ffaa 100644
--- a/utils/nfsdcltrack/sqlite.c
+++ b/utils/nfsdcltrack/sqlite.c
@@ -547,3 +547,43 @@ sqlite_remove_unreclaimed(time_t grace_start)
sqlite3_free(err);
return ret;
}
+
+/*
+ * Are there any clients that are possibly still reclaiming? Return a positive
+ * integer (usually number of clients) if so. If not, then return 0. On any
+ * error, return non-zero.
+ */
+int
+sqlite_query_reclaiming(const time_t grace_start)
+{
+ int ret;
+ sqlite3_stmt *stmt = NULL;
+
+ ret = sqlite3_prepare_v2(dbh, "SELECT count(*) FROM clients WHERE "
+ "reclaim_complete < ?", -1, &stmt, NULL);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "%s: unable to prepare select statement: %s",
+ __func__, sqlite3_errstr(ret));
+ return ret;
+ }
+
+ ret = sqlite3_bind_int64(stmt, 1, (sqlite3_int64)grace_start);
+ if (ret != SQLITE_OK) {
+ xlog(L_ERROR, "%s: bind int64 failed: %s",
+ __func__, sqlite3_errstr(ret));
+ return ret;
+ }
+
+ ret = sqlite3_step(stmt);
+ if (ret != SQLITE_ROW) {
+ xlog(L_ERROR, "%s: unexpected return code from select: %s",
+ __func__, sqlite3_errstr(ret));
+ return ret;
+ }
+
+ ret = sqlite3_column_int(stmt, 0);
+ sqlite3_finalize(stmt);
+ xlog(D_GENERAL, "%s: there are %d clients that have not completed "
+ "reclaim", __func__, ret);
+ return ret;
+}
diff --git a/utils/nfsdcltrack/sqlite.h b/utils/nfsdcltrack/sqlite.h
index e9cc84cbb294..0e1939198367 100644
--- a/utils/nfsdcltrack/sqlite.h
+++ b/utils/nfsdcltrack/sqlite.h
@@ -26,5 +26,6 @@ int sqlite_insert_client(const unsigned char *clname, const size_t namelen,
int sqlite_remove_client(const unsigned char *clname, const size_t namelen);
int sqlite_check_client(const unsigned char *clname, const size_t namelen);
int sqlite_remove_unreclaimed(const time_t grace_start);
+int sqlite_query_reclaiming(const time_t grace_start);

#endif /* _SQLITE_H */
--
1.9.3