2014-04-27 21:17:51

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 0/5] Lustre Klocwork fixes

This is just splitting big lnet fixes patch from Klocwork static
analysis tool into smaller bits.

Dmitry Eremin (5):
staging/lustre/lnet: fix potential null pointer dereference
staging/lustre/lnet: remove unused variable
staging/lustre/lnet: fix potential null pointer dereference
staging/lustre/lnet: fix potential null pointer dereference
staging/lustre/lnet: fix potential null pointer dereference

drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 ++++++--
drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 ++--
drivers/staging/lustre/lnet/lnet/api-ni.c | 6 +++---
drivers/staging/lustre/lnet/lnet/router.c | 3 ++-
drivers/staging/lustre/lnet/selftest/framework.c | 14 +++++++++++---
5 files changed, 24 insertions(+), 11 deletions(-)

--
1.9.0


2014-04-27 21:17:45

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 3/5] staging/lustre/lnet: fix potential null pointer dereference

From: Dmitry Eremin <[email protected]>

Pointer 'tsc' returned from call to function 'sfw_find_test_case'
at line 571 may be NULL and will be dereferenced at line 572.
found by Klocwork Insight tool

Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/9386
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lnet/selftest/framework.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
index 050723a..c141f93 100644
--- a/drivers/staging/lustre/lnet/selftest/framework.c
+++ b/drivers/staging/lustre/lnet/selftest/framework.c
@@ -547,10 +547,16 @@ sfw_test_rpc_fini (srpc_client_rpc_t *rpc)
static inline int
sfw_test_buffers(sfw_test_instance_t *tsi)
{
- struct sfw_test_case *tsc = sfw_find_test_case(tsi->tsi_service);
- struct srpc_service *svc = tsc->tsc_srv_service;
+ struct sfw_test_case *tsc;
+ struct srpc_service *svc;
int nbuf;

+ LASSERT(tsi != NULL);
+ tsc = sfw_find_test_case(tsi->tsi_service);
+ LASSERT(tsc != NULL);
+ svc = tsc->tsc_srv_service;
+ LASSERT(svc != NULL);
+
nbuf = min(svc->sv_wi_total, tsi->tsi_loop) / svc->sv_ncpts;
return max(SFW_TEST_WI_MIN, nbuf + SFW_TEST_WI_EXTRA);
}
@@ -595,8 +601,10 @@ sfw_load_test(struct sfw_test_instance *tsi)
void
sfw_unload_test(struct sfw_test_instance *tsi)
{
- struct sfw_test_case *tsc = sfw_find_test_case(tsi->tsi_service);
+ struct sfw_test_case *tsc;

+ LASSERT(tsi != NULL);
+ tsc = sfw_find_test_case(tsi->tsi_service);
LASSERT(tsc != NULL);

if (tsi->tsi_is_client)
--
1.9.0

2014-04-27 21:17:49

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 5/5] staging/lustre/lnet: fix potential null pointer dereference

From: Dmitry Eremin <[email protected]>

Pointer 'ni' checked for NULL at line 1569 may be passed to
function and may be dereferenced there by passing argument 1 to
function 'lnet_ni_notify_locked' at line 1621.
found by Klocwork Insight tool

Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/9386
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lnet/lnet/router.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 995f509..ba0278e 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -1559,7 +1559,8 @@ lnet_notify(lnet_ni_t *ni, lnet_nid_t nid, int alive, cfs_time_t when)

lnet_notify_locked(lp, ni == NULL, alive, when);

- lnet_ni_notify_locked(ni, lp);
+ if (ni != NULL)
+ lnet_ni_notify_locked(ni, lp);

lnet_peer_decref_locked(lp);

--
1.9.0

2014-04-27 21:18:36

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 4/5] staging/lustre/lnet: fix potential null pointer dereference

From: Dmitry Eremin <[email protected]>

Null pointer 'cp' that comes from line 2544 may be dereferenced
at line 2618.
found by Klocwork Insight tool

Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/9386
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 6173e74..9bf6c94 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -2609,13 +2609,17 @@ kiblnd_rejected (kib_conn_t *conn, int reason, void *priv, int priv_nob)

case IBLND_REJECT_MSG_QUEUE_SIZE:
CERROR("%s rejected: incompatible message queue depth %d, %d\n",
- libcfs_nid2str(peer->ibp_nid), cp->ibcp_queue_depth,
+ libcfs_nid2str(peer->ibp_nid),
+ cp != NULL ? cp->ibcp_queue_depth :
+ IBLND_MSG_QUEUE_SIZE(rej->ibr_version),
IBLND_MSG_QUEUE_SIZE(conn->ibc_version));
break;

case IBLND_REJECT_RDMA_FRAGS:
CERROR("%s rejected: incompatible # of RDMA fragments %d, %d\n",
- libcfs_nid2str(peer->ibp_nid), cp->ibcp_max_frags,
+ libcfs_nid2str(peer->ibp_nid),
+ cp != NULL ? cp->ibcp_max_frags :
+ IBLND_RDMA_FRAGS(rej->ibr_version),
IBLND_RDMA_FRAGS(conn->ibc_version));
break;

--
1.9.0

2014-04-27 21:17:43

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 1/5] staging/lustre/lnet: fix potential null pointer dereference

From: Dmitry Eremin <[email protected]>

Null pointer 'best_iface' that comes from line 802 may be
dereferenced at line 832.
found by Klocwork Insight tool

Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/9386
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index 21d36ee..516f623 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -829,14 +829,14 @@ ksocknal_select_ips(ksock_peer_t *peer, __u32 *peerips, int n_peerips)
best_npeers = iface->ksni_npeers;
}

+ LASSERT(best_iface != NULL);
+
best_iface->ksni_npeers++;
ip = best_iface->ksni_ipaddr;
peer->ksnp_passive_ips[i] = ip;
peer->ksnp_n_passive_ips = i+1;
}

- LASSERT (best_iface != NULL);
-
/* mark the best matching peer IP used */
j = ksocknal_match_peerip(best_iface, peerips, n_peerips);
peerips[j] = 0;
--
1.9.0

2014-04-27 21:19:07

by Oleg Drokin

[permalink] [raw]
Subject: [PATCH 2/5] staging/lustre/lnet: remove unused variable

From: Dmitry Eremin <[email protected]>

Local variable 'hash' is never used
found by Klocwork Insight tool

Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/9386
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lnet/lnet/api-ni.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lnet/lnet/api-ni.c b/drivers/staging/lustre/lnet/lnet/api-ni.c
index 85b8d81..3f1fdaa 100644
--- a/drivers/staging/lustre/lnet/lnet/api-ni.c
+++ b/drivers/staging/lustre/lnet/lnet/api-ni.c
@@ -127,8 +127,7 @@ lnet_create_remote_nets_table(void)
static void
lnet_destroy_remote_nets_table(void)
{
- int i;
- struct list_head *hash;
+ int i;

if (the_lnet.ln_remote_nets_hash == NULL)
return;
@@ -137,7 +136,8 @@ lnet_destroy_remote_nets_table(void)
LASSERT(list_empty(&the_lnet.ln_remote_nets_hash[i]));

LIBCFS_FREE(the_lnet.ln_remote_nets_hash,
- LNET_REMOTE_NETS_HASH_SIZE * sizeof(*hash));
+ LNET_REMOTE_NETS_HASH_SIZE *
+ sizeof(the_lnet.ln_remote_nets_hash[0]));
the_lnet.ln_remote_nets_hash = NULL;
}

--
1.9.0

2014-04-27 22:06:04

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH 1/5] staging/lustre/lnet: fix potential null pointer dereference

> Reviewed-by: John L. Hammond <[email protected]>
> Reviewed-by: Isaac Huang <[email protected]>
> Signed-off-by: Oleg Drokin <[email protected]>
> ---
> drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> index 21d36ee..516f623 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> @@ -829,14 +829,14 @@ ksocknal_select_ips(ksock_peer_t *peer, __u32 *peerips, int n_peerips)
> best_npeers = iface->ksni_npeers;
> }
>
> + LASSERT(best_iface != NULL);
> +

And this solves the problem how ???

These dont' appear that helpful. If we dereference a NULL point we'll
already get a suitably spectacular oops and call trace dump.

Alan

2014-04-27 22:20:19

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 1/5] staging/lustre/lnet: fix potential null pointer dereference

On Apr 27, 2014, at 6:05 PM, One Thousand Gnomes wrote:

>> Reviewed-by: John L. Hammond <[email protected]>
>> Reviewed-by: Isaac Huang <[email protected]>
>> Signed-off-by: Oleg Drokin <[email protected]>
>> ---
>> drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
>> index 21d36ee..516f623 100644
>> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
>> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
>> @@ -829,14 +829,14 @@ ksocknal_select_ips(ksock_peer_t *peer, __u32 *peerips, int n_peerips)
>> best_npeers = iface->ksni_npeers;
>> }
>>
>> + LASSERT(best_iface != NULL);
>> +
>
> And this solves the problem how ???

This does not really solve anything.
Just moves the check to where it's actually going to do anything instead of the check being guarded by the NULL pointer access earlier on.
Could have been removed instead of course to just get an oops at all times.-

2014-04-27 22:36:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/5] Lustre Klocwork fixes

On Sun, Apr 27, 2014 at 05:17:21PM -0400, Oleg Drokin wrote:
> This is just splitting big lnet fixes patch from Klocwork static
> analysis tool into smaller bits.
>
> Dmitry Eremin (5):
> staging/lustre/lnet: fix potential null pointer dereference
> staging/lustre/lnet: remove unused variable
> staging/lustre/lnet: fix potential null pointer dereference
> staging/lustre/lnet: fix potential null pointer dereference
> staging/lustre/lnet: fix potential null pointer dereference

You can't send the same subject: for all patches, that's not good.
Please be more specific.

greg k-h

2014-04-27 22:38:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/5] staging/lustre/lnet: fix potential null pointer dereference

On Sun, Apr 27, 2014 at 06:19:59PM -0400, Oleg Drokin wrote:
> On Apr 27, 2014, at 6:05 PM, One Thousand Gnomes wrote:
>
> >> Reviewed-by: John L. Hammond <[email protected]>
> >> Reviewed-by: Isaac Huang <[email protected]>
> >> Signed-off-by: Oleg Drokin <[email protected]>
> >> ---
> >> drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> >> index 21d36ee..516f623 100644
> >> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> >> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> >> @@ -829,14 +829,14 @@ ksocknal_select_ips(ksock_peer_t *peer, __u32 *peerips, int n_peerips)
> >> best_npeers = iface->ksni_npeers;
> >> }
> >>
> >> + LASSERT(best_iface != NULL);
> >> +
> >
> > And this solves the problem how ???
>
> This does not really solve anything.
> Just moves the check to where it's actually going to do anything instead of the check being guarded by the NULL pointer access earlier on.
> Could have been removed instead of course to just get an oops at all times.

Then please just remove it, or, even better yet, fix it properly if
there is a way that this pointer can be NULL.

greg k-h

2014-04-27 22:38:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/5] staging/lustre/lnet: fix potential null pointer dereference

On Sun, Apr 27, 2014 at 05:17:26PM -0400, Oleg Drokin wrote:
> From: Dmitry Eremin <[email protected]>
>
> Pointer 'ni' checked for NULL at line 1569 may be passed to
> function and may be dereferenced there by passing argument 1 to
> function 'lnet_ni_notify_locked' at line 1621.
> found by Klocwork Insight tool
>
> Signed-off-by: Dmitry Eremin <[email protected]>
> Reviewed-on: http://review.whamcloud.com/9386
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
> Reviewed-by: John L. Hammond <[email protected]>
> Reviewed-by: Isaac Huang <[email protected]>
> Signed-off-by: Oleg Drokin <[email protected]>
> ---
> drivers/staging/lustre/lnet/lnet/router.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
> index 995f509..ba0278e 100644
> --- a/drivers/staging/lustre/lnet/lnet/router.c
> +++ b/drivers/staging/lustre/lnet/lnet/router.c
> @@ -1559,7 +1559,8 @@ lnet_notify(lnet_ni_t *ni, lnet_nid_t nid, int alive, cfs_time_t when)
>
> lnet_notify_locked(lp, ni == NULL, alive, when);
>
> - lnet_ni_notify_locked(ni, lp);
> + if (ni != NULL)
> + lnet_ni_notify_locked(ni, lp);

Why can't lnet_ni_notify_locked() accept NULL as an input?

2014-04-27 22:39:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3/5] staging/lustre/lnet: fix potential null pointer dereference

On Sun, Apr 27, 2014 at 05:17:24PM -0400, Oleg Drokin wrote:
> From: Dmitry Eremin <[email protected]>
>
> Pointer 'tsc' returned from call to function 'sfw_find_test_case'
> at line 571 may be NULL and will be dereferenced at line 572.
> found by Klocwork Insight tool
>
> Signed-off-by: Dmitry Eremin <[email protected]>
> Reviewed-on: http://review.whamcloud.com/9386
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4629
> Reviewed-by: John L. Hammond <[email protected]>
> Reviewed-by: Isaac Huang <[email protected]>
> Signed-off-by: Oleg Drokin <[email protected]>
> ---
> drivers/staging/lustre/lnet/selftest/framework.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/staging/lustre/lnet/selftest/framework.c b/drivers/staging/lustre/lnet/selftest/framework.c
> index 050723a..c141f93 100644
> --- a/drivers/staging/lustre/lnet/selftest/framework.c
> +++ b/drivers/staging/lustre/lnet/selftest/framework.c
> @@ -547,10 +547,16 @@ sfw_test_rpc_fini (srpc_client_rpc_t *rpc)
> static inline int
> sfw_test_buffers(sfw_test_instance_t *tsi)
> {
> - struct sfw_test_case *tsc = sfw_find_test_case(tsi->tsi_service);
> - struct srpc_service *svc = tsc->tsc_srv_service;
> + struct sfw_test_case *tsc;
> + struct srpc_service *svc;
> int nbuf;
>
> + LASSERT(tsi != NULL);

Asserts suck rocks, which is why we avoid them at all costs. Please
just fix the code to work properly. If that parameter can be NULL, then
properly handle the error and return, don't oops the kernel.

greg k-h

2014-04-27 23:28:42

by Oleg Drokin

[permalink] [raw]
Subject: Re: [PATCH 5/5] staging/lustre/lnet: fix potential null pointer dereference

Hello!

On Apr 27, 2014, at 6:39 PM, Greg Kroah-Hartman wrote:
>>
>> - lnet_ni_notify_locked(ni, lp);
>> + if (ni != NULL)
>> + lnet_ni_notify_locked(ni, lp);
>
> Why can't lnet_ni_notify_locked() accept NULL as an input?

It makes no sense, because then there is nowhere to send the notification.
That said, it appears a race is possible where one caller updated let_peer structure to ask for a notification
and then we fell through here with a NULL ni and called into lnet_ni_notify_locked
where we'd try to dereference this NULL ni.
But this is the only called that accepts separate ni and lp, where as the only other caller gets them from the same struct
where they are updated more in sync.

I guess it makes sense to update lnet_ni_notify_locked as a future-proofing excercise.

Thanks, I'll update this patch.

Bye,
Oleg-