2024-03-19 00:44:41

by NeilBrown

[permalink] [raw]
Subject: Question about possible use-after-free in nfs_direct_write_reschedule()


I've been reviewing some nfs/direct.c patches for possible backport to
one of our older enterprise kernels and I've found something that looks
wrong.

It isn't clear to me how to exercise the code so I haven't be able to
trigger a problem. I'm hoping that someone could either explain when
this code runs, or confirm if the code is correct or not.

Commit 954998b60caa ("NFS: Fix error handling for O_DIRECT write scheduling")

adds an extra call to nfs_release_request() but I cannot find any place
that an extra reference is taken.

The code currently reads:

while (!list_empty(&reqs)) {
req = nfs_list_entry(reqs.next);
nfs_list_remove_request(req);
nfs_unlock_and_release_request(req);
if (desc.pg_error == -EAGAIN) {
nfs_mark_request_commit(req, NULL, &cinfo, 0);
} else {
spin_lock(&dreq->lock);
nfs_direct_truncate_request(dreq, req);
spin_unlock(&dreq->lock);
nfs_release_request(req);
}
}

after the nfs_unlock_and_release_request() call I would expect that the
request could be freed, so that nfs_mark_request_commit() or the
nfs_release_request() could cause a problem.

Superficially it looks like the call should be simply
nfs_unlock_request(). This would follow the
list_remove;unlock;mark_commit pattern also found in
nfs_direct_write_reschedule_io().

Do we need:
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -581,7 +581,7 @@ static void nfs_direct_write_reschedule(struct nfs_direct_req *dreq)
while (!list_empty(&reqs)) {
req = nfs_list_entry(reqs.next);
nfs_list_remove_request(req);
- nfs_unlock_and_release_request(req);
+ nfs_unlock_request(req);
if (desc.pg_error == -EAGAIN) {
nfs_mark_request_commit(req, NULL, &cinfo, 0);
} else {

??

Thanks,
NeilBrown


2024-03-19 03:54:21

by Trond Myklebust

[permalink] [raw]
Subject: Re: Question about possible use-after-free in nfs_direct_write_reschedule()

On Tue, 2024-03-19 at 11:36 +1100, NeilBrown wrote:
>
> I've been reviewing some nfs/direct.c patches for possible backport
> to
> one of our older enterprise kernels and I've found something that
> looks
> wrong.
>
> It isn't clear to me how to exercise the code so I haven't be able to
> trigger a problem.  I'm hoping that someone could either explain when
> this code runs, or confirm if the code is correct or not.
>
> Commit 954998b60caa ("NFS: Fix error handling for O_DIRECT write
> scheduling")
>
> adds an extra call to nfs_release_request() but I cannot find any
> place
> that an extra reference is taken.
>
> The code currently reads:
>
> while (!list_empty(&reqs)) {
> req = nfs_list_entry(reqs.next);
> nfs_list_remove_request(req);
> nfs_unlock_and_release_request(req);
> if (desc.pg_error == -EAGAIN) {
> nfs_mark_request_commit(req, NULL, &cinfo,
> 0);
> } else {
> spin_lock(&dreq->lock);
> nfs_direct_truncate_request(dreq, req);
> spin_unlock(&dreq->lock);
> nfs_release_request(req);
> }
> }
>
> after the nfs_unlock_and_release_request() call I would expect that
> the
> request could be freed, so that nfs_mark_request_commit() or the
> nfs_release_request() could cause a problem.
>
> Superficially it looks like the call should be simply
> nfs_unlock_request().  This would follow the
> list_remove;unlock;mark_commit pattern also found in
> nfs_direct_write_reschedule_io().
>
> Do we need:
> --- a/fs/nfs/direct.c
> +++ b/fs/nfs/direct.c
> @@ -581,7 +581,7 @@ static void nfs_direct_write_reschedule(struct
> nfs_direct_req *dreq)
>   while (!list_empty(&reqs)) {
>   req = nfs_list_entry(reqs.next);
>   nfs_list_remove_request(req);
> - nfs_unlock_and_release_request(req);
> + nfs_unlock_request(req);
>   if (desc.pg_error == -EAGAIN) {
>   nfs_mark_request_commit(req, NULL, &cinfo,
> 0);
>   } else {

See the full code that was changed:

- list_for_each_entry_safe(req, tmp, &reqs, wb_list) {
+ while (!list_empty(&reqs)) {
+ req = nfs_list_entry(reqs.next);
/* Bump the transmission count */
req->wb_nio++;
if (!nfs_pageio_add_request(&desc, req)) {
- nfs_list_move_request(req, &failed);
spin_lock(&cinfo.inode->i_lock);
- dreq->flags = 0;
- if (desc.pg_error < 0)
+ if (dreq->error < 0) {
+ desc.pg_error = dreq->error;
+ } else if (desc.pg_error != -EAGAIN) {
+ dreq->flags = 0;
+ if (!desc.pg_error)
+ desc.pg_error = -EIO;
dreq->error = desc.pg_error;
- else
- dreq->error = -EIO;
+ } else
+ dreq->flags =
NFS_ODIRECT_RESCHED_WRITES;
spin_unlock(&cinfo.inode->i_lock);
+ break;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Prior to this patch, we did not break out of the loop until the entire
"reqs" list hand been handled.

}
nfs_release_request(req);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Prior to this patch, every request on the "reqs" list was released,
whether or not they were being moved to the "failed" list.
}
nfs_pageio_complete(&desc);

- while (!list_empty(&failed)) {
- req = nfs_list_entry(failed.next);
+ while (!list_empty(&reqs)) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Prior to this patch, every request that was being handled here had
already seen a call to nfs_release_request() because we had already
gone through the entire list of "reqs".
With this patch applied, we're now handling all the requests that are
left on "reqs", and that have not been released.

+ req = nfs_list_entry(reqs.next);
nfs_list_remove_request(req);
nfs_unlock_and_release_request(req);
+ if (desc.pg_error == -EAGAIN)
+ nfs_mark_request_commit(req, NULL, &cinfo, 0);
+ else
+ nfs_release_request(req);
}


>
> ??
>
> Thanks,
> NeilBrown

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2024-03-19 05:12:41

by NeilBrown

[permalink] [raw]
Subject: Re: Question about possible use-after-free in nfs_direct_write_reschedule()

On Tue, 19 Mar 2024, Trond Myklebust wrote:
> On Tue, 2024-03-19 at 11:36 +1100, NeilBrown wrote:
> >
> > I've been reviewing some nfs/direct.c patches for possible backport
> > to
> > one of our older enterprise kernels and I've found something that
> > looks
> > wrong.
> >
> > It isn't clear to me how to exercise the code so I haven't be able to
> > trigger a problem.  I'm hoping that someone could either explain when
> > this code runs, or confirm if the code is correct or not.
> >
> > Commit 954998b60caa ("NFS: Fix error handling for O_DIRECT write
> > scheduling")
> >
> > adds an extra call to nfs_release_request() but I cannot find any
> > place
> > that an extra reference is taken.
> >
> > The code currently reads:
> >
> > while (!list_empty(&reqs)) {
> > req = nfs_list_entry(reqs.next);
> > nfs_list_remove_request(req);
> > nfs_unlock_and_release_request(req);
> > if (desc.pg_error == -EAGAIN) {
> > nfs_mark_request_commit(req, NULL, &cinfo,
> > 0);
> > } else {
> > spin_lock(&dreq->lock);
> > nfs_direct_truncate_request(dreq, req);
> > spin_unlock(&dreq->lock);
> > nfs_release_request(req);
> > }
> > }
> >
> > after the nfs_unlock_and_release_request() call I would expect that
> > the
> > request could be freed, so that nfs_mark_request_commit() or the
> > nfs_release_request() could cause a problem.
> >
> > Superficially it looks like the call should be simply
> > nfs_unlock_request().  This would follow the
> > list_remove;unlock;mark_commit pattern also found in
> > nfs_direct_write_reschedule_io().
> >
> > Do we need:
> > --- a/fs/nfs/direct.c
> > +++ b/fs/nfs/direct.c
> > @@ -581,7 +581,7 @@ static void nfs_direct_write_reschedule(struct
> > nfs_direct_req *dreq)
> >   while (!list_empty(&reqs)) {
> >   req = nfs_list_entry(reqs.next);
> >   nfs_list_remove_request(req);
> > - nfs_unlock_and_release_request(req);
> > + nfs_unlock_request(req);
> >   if (desc.pg_error == -EAGAIN) {
> >   nfs_mark_request_commit(req, NULL, &cinfo,
> > 0);
> >   } else {
>
> See the full code that was changed:
>
> - list_for_each_entry_safe(req, tmp, &reqs, wb_list) {
> + while (!list_empty(&reqs)) {
> + req = nfs_list_entry(reqs.next);
> /* Bump the transmission count */
> req->wb_nio++;
> if (!nfs_pageio_add_request(&desc, req)) {
> - nfs_list_move_request(req, &failed);
> spin_lock(&cinfo.inode->i_lock);
> - dreq->flags = 0;
> - if (desc.pg_error < 0)
> + if (dreq->error < 0) {
> + desc.pg_error = dreq->error;
> + } else if (desc.pg_error != -EAGAIN) {
> + dreq->flags = 0;
> + if (!desc.pg_error)
> + desc.pg_error = -EIO;
> dreq->error = desc.pg_error;
> - else
> - dreq->error = -EIO;
> + } else
> + dreq->flags =
> NFS_ODIRECT_RESCHED_WRITES;
> spin_unlock(&cinfo.inode->i_lock);
> + break;
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Prior to this patch, we did not break out of the loop until the entire
> "reqs" list hand been handled.
>
> }
> nfs_release_request(req);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Prior to this patch, every request on the "reqs" list was released,
> whether or not they were being moved to the "failed" list.
> }
> nfs_pageio_complete(&desc);
>
> - while (!list_empty(&failed)) {
> - req = nfs_list_entry(failed.next);
> + while (!list_empty(&reqs)) {
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Prior to this patch, every request that was being handled here had
> already seen a call to nfs_release_request() because we had already
> gone through the entire list of "reqs".
> With this patch applied, we're now handling all the requests that are
> left on "reqs", and that have not been released.

Ahhh - I get it now - thanks a lot for the explanation.

NeilBrown


>
> + req = nfs_list_entry(reqs.next);
> nfs_list_remove_request(req);
> nfs_unlock_and_release_request(req);
> + if (desc.pg_error == -EAGAIN)
> + nfs_mark_request_commit(req, NULL, &cinfo, 0);
> + else
> + nfs_release_request(req);
> }
>
>
> >
> > ??
> >
> > Thanks,
> > NeilBrown
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>
>