From: Oleg Drokin <[email protected]>
These 20 patches represent various Lustre fixes, mostly taken from the
development Lustre tree.
The series also unbreaks Lustre flock functionality that was accidentally
broken in the staging tree by a cleanup.
Please consider.
Alexander Boyko (1):
staging/lustre/osc: split different type of IO
Dmitry Eremin (2):
staging/lustre: fix comparison between signed and unsigned
staging/lustre/obd: change type of cl_conn_count to size_t
Frank Zago (1):
staging/lustre/libcfs: protect kkuc_groups from write access
Henri Doreau (1):
staging/lustre/obdclass: Proper swabbing of llog_rec_tail.
Hongchao Zhang (1):
staging/lustre/obdclass: fix a race in recovery
Jinshan Xiong (2):
staging/lustre/llite: to configure max_cached_mb correctly
staging/lustre/llite: Solve a race to access lli_has_smd in read case
John L. Hammond (1):
staging/lustre/llite: remove llite proc root on init failure
Lai Siyao (1):
staging/lustre/llite: don't add to page cache upon failure
Li Xi (2):
staging/lustre/ldlm: high load because of negative timeout
staging/lustre/libcfs: fix illegal page access of tracefiled()
Liang Zhen (3):
staging/lustre/ptlrpc: avoid list scan in ptlrpcd_check
staging/lustre/lnet: peer aliveness status and NI status
staging/lustre/lnet: portal spreading rotor should be unsigned
Niu Yawei (1):
staging/lustre/ptlrpc: hold rq_lock when modify rq_flags
Oleg Drokin (1):
staging/lustre/mdc: Initialize req in mdc_enqueue for !it case
Patrick Farrell (1):
staging/lustre/clio: Do not allow group locks with gid 0
Yang Sheng (1):
staging/lustre/llite: Add exception entry check after radix_tree
wang di (1):
staging/lustre/fld: refer to MDT0 for fld lookup in some cases
.../staging/lustre/include/linux/lnet/lib-lnet.h | 10 ++
.../staging/lustre/include/linux/lnet/lib-types.h | 2 +-
drivers/staging/lustre/lnet/lnet/lib-move.c | 13 ++
drivers/staging/lustre/lnet/lnet/lib-ptl.c | 8 +-
drivers/staging/lustre/lnet/lnet/router.c | 17 ++-
drivers/staging/lustre/lustre/fld/fld_request.c | 9 ++
drivers/staging/lustre/lustre/include/lclient.h | 1 +
.../lustre/lustre/include/lustre/lustre_user.h | 2 +-
drivers/staging/lustre/lustre/include/obd.h | 2 +-
drivers/staging/lustre/lustre/lclient/lcommon_cl.c | 6 +
drivers/staging/lustre/lustre/ldlm/ldlm_lib.c | 6 +-
drivers/staging/lustre/lustre/ldlm/ldlm_pool.c | 32 ++++-
.../lustre/lustre/libcfs/kernel_user_comm.c | 4 +-
drivers/staging/lustre/lustre/libcfs/tracefile.c | 1 +
drivers/staging/lustre/lustre/llite/dir.c | 7 +-
drivers/staging/lustre/lustre/llite/file.c | 5 +
drivers/staging/lustre/lustre/llite/lproc_llite.c | 8 +-
drivers/staging/lustre/lustre/llite/super25.c | 141 +++++++++++----------
drivers/staging/lustre/lustre/llite/vvp_io.c | 15 +--
drivers/staging/lustre/lustre/llite/vvp_lock.c | 1 +
drivers/staging/lustre/lustre/mdc/mdc_locks.c | 1 +
drivers/staging/lustre/lustre/mgc/mgc_request.c | 4 +-
drivers/staging/lustre/lustre/obdclass/genops.c | 10 +-
drivers/staging/lustre/lustre/obdclass/llog_swab.c | 5 +-
drivers/staging/lustre/lustre/osc/osc_cache.c | 12 ++
drivers/staging/lustre/lustre/ptlrpc/client.c | 14 +-
drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 23 ++--
27 files changed, 234 insertions(+), 125 deletions(-)
--
2.1.0
From: Liang Zhen <[email protected]>
ptlrpcd_check() always scans all requests on the ptlrpc_request_set
and tries to finish completed requests, which is inefficient.
Even worse, l_wait_event() always checks the condition twice
before sleeping and once more after waking up, which means
it calls ptlrpcd_check() three times in each loop.
This patch moves completed requests to the head of the list
in ptlrpc_check_set(); with this change ptlrpcd_check() no longer
needs to scan all requests.
Signed-off-by: Liang Zhen <[email protected]>
Reviewed-on: http://review.whamcloud.com/11513
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5548
Reviewed-by: Bobi Jam <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Johann Lombardi <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/ptlrpc/client.c | 12 +++++++++++-
drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 23 +++++++++--------------
2 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index dc9e406..8c1ec83 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1497,11 +1497,13 @@ static inline int ptlrpc_set_producer(struct ptlrpc_request_set *set)
int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set)
{
struct list_head *tmp, *next;
+ struct list_head comp_reqs;
int force_timer_recalc = 0;
if (atomic_read(&set->set_remaining) == 0)
return 1;
+ INIT_LIST_HEAD(&comp_reqs);
list_for_each_safe(tmp, next, &set->set_requests) {
struct ptlrpc_request *req =
list_entry(tmp, struct ptlrpc_request,
@@ -1576,8 +1578,10 @@ int ptlrpc_check_set(const struct lu_env *env, struct ptlrpc_request_set *set)
ptlrpc_rqphase_move(req, req->rq_next_phase);
}
- if (req->rq_phase == RQ_PHASE_COMPLETE)
+ if (req->rq_phase == RQ_PHASE_COMPLETE) {
+ list_move_tail(&req->rq_set_chain, &comp_reqs);
continue;
+ }
if (req->rq_phase == RQ_PHASE_INTERPRET)
goto interpret;
@@ -1860,9 +1864,15 @@ interpret:
if (req->rq_status != 0)
set->set_rc = req->rq_status;
ptlrpc_req_finished(req);
+ } else {
+ list_move_tail(&req->rq_set_chain, &comp_reqs);
}
}
+ /* move completed request at the head of list so it's easier for
+ * caller to find them */
+ list_splice(&comp_reqs, &set->set_requests);
+
/* If we hit an error, we want to recover promptly. */
return atomic_read(&set->set_remaining) == 0 || force_timer_recalc;
}
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index cbcc541..4621b71 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -306,21 +306,16 @@ static int ptlrpcd_check(struct lu_env *env, struct ptlrpcd_ctl *pc)
if (atomic_read(&set->set_remaining))
rc |= ptlrpc_check_set(env, set);
- if (!list_empty(&set->set_requests)) {
- /*
- * XXX: our set never completes, so we prune the completed
- * reqs after each iteration. boy could this be smarter.
- */
- list_for_each_safe(pos, tmp, &set->set_requests) {
- req = list_entry(pos, struct ptlrpc_request,
- rq_set_chain);
- if (req->rq_phase != RQ_PHASE_COMPLETE)
- continue;
+ /* NB: ptlrpc_check_set has already moved completed request at the
+ * head of seq::set_requests */
+ list_for_each_safe(pos, tmp, &set->set_requests) {
+ req = list_entry(pos, struct ptlrpc_request, rq_set_chain);
+ if (req->rq_phase != RQ_PHASE_COMPLETE)
+ break;
- list_del_init(&req->rq_set_chain);
- req->rq_set = NULL;
- ptlrpc_req_finished(req);
- }
+ list_del_init(&req->rq_set_chain);
+ req->rq_set = NULL;
+ ptlrpc_req_finished(req);
}
if (rc == 0) {
--
2.1.0
From: Alexander Boyko <[email protected]>
Do not allow different types of pages in the same RPC.
Signed-off-by: Alexander Boyko <[email protected]>
Signed-off-by: Vitaly Fertman <[email protected]>
Xyratex-bug-id: MRP-859
Reviewed-on: http://review.whamcloud.com/10930
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3192
Reviewed-by: Jinshan Xiong <[email protected]>
Reviewed-by: Niu Yawei <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/osc/osc_cache.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 370e6d4..7022ed4 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -1820,6 +1820,9 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
int *pc, unsigned int *max_pages)
{
struct osc_extent *tmp;
+ struct osc_async_page *oap = list_first_entry(&ext->oe_pages,
+ struct osc_async_page,
+ oap_pending_item);
EASSERT((ext->oe_state == OES_CACHE || ext->oe_state == OES_LOCK_DONE),
ext);
@@ -1829,6 +1832,10 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
return 0;
list_for_each_entry(tmp, rpclist, oe_link) {
+ struct osc_async_page *oap2;
+
+ oap2 = list_first_entry(&tmp->oe_pages, struct osc_async_page,
+ oap_pending_item);
EASSERT(tmp->oe_owner == current, tmp);
#if 0
if (overlapped(tmp, ext)) {
@@ -1836,6 +1843,11 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
EASSERT(0, ext);
}
#endif
+ if (oap2cl_page(oap)->cp_type != oap2cl_page(oap2)->cp_type) {
+ CDEBUG(D_CACHE, "Do not permit different type of IO"
+ " for a same RPC\n");
+ return 0;
+ }
if (tmp->oe_srvlock != ext->oe_srvlock ||
!tmp->oe_grants != !ext->oe_grants)
--
2.1.0
From: Li Xi <[email protected]>
When the time taken by LRU resizing exceeds the recalculation
waiting period, the ldlm daemon keeps resizing without any rest
interval, which causes high CPU load.
This patch fixes the problem by setting the recalculation timestamp
after LRU resizing finishes rather than before it. In addition, an
interval of at least one second is enforced between recalculations.
Signed-off-by: Li Xi <[email protected]>
Reviewed-on: http://review.whamcloud.com/11227
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5415
Reviewed-by: Dmitry Eremin <[email protected]>
Reviewed-by: Bobi Jam <[email protected]>
Reviewed-by: Lai Siyao <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/ldlm/ldlm_pool.c | 32 +++++++++++++++++++++-----
1 file changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
index 142b3dd..d20d277 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
@@ -470,6 +470,7 @@ static void ldlm_cli_pool_pop_slv(struct ldlm_pool *pl)
static int ldlm_cli_pool_recalc(struct ldlm_pool *pl)
{
time_t recalc_interval_sec;
+ int ret;
recalc_interval_sec = get_seconds() - pl->pl_recalc_time;
if (recalc_interval_sec < pl->pl_recalc_period)
@@ -490,16 +491,15 @@ static int ldlm_cli_pool_recalc(struct ldlm_pool *pl)
*/
ldlm_cli_pool_pop_slv(pl);
- pl->pl_recalc_time = get_seconds();
- lprocfs_counter_add(pl->pl_stats, LDLM_POOL_TIMING_STAT,
- recalc_interval_sec);
spin_unlock(&pl->pl_lock);
/*
* Do not cancel locks in case lru resize is disabled for this ns.
*/
- if (!ns_connect_lru_resize(ldlm_pl2ns(pl)))
- return 0;
+ if (!ns_connect_lru_resize(ldlm_pl2ns(pl))) {
+ ret = 0;
+ goto out;
+ }
/*
* In the time of canceling locks on client we do not need to maintain
@@ -507,7 +507,19 @@ static int ldlm_cli_pool_recalc(struct ldlm_pool *pl)
* It may be called when SLV has changed much, this is why we do not
* take into account pl->pl_recalc_time here.
*/
- return ldlm_cancel_lru(ldlm_pl2ns(pl), 0, LCF_ASYNC, LDLM_CANCEL_LRUR);
+ ret = ldlm_cancel_lru(ldlm_pl2ns(pl), 0, LCF_ASYNC, LDLM_CANCEL_LRUR);
+
+out:
+ spin_lock(&pl->pl_lock);
+ /*
+ * Time of LRU resizing might be longer than period,
+ * so update after LRU resizing rather than before it.
+ */
+ pl->pl_recalc_time = get_seconds();
+ lprocfs_counter_add(pl->pl_stats, LDLM_POOL_TIMING_STAT,
+ recalc_interval_sec);
+ spin_unlock(&pl->pl_lock);
+ return ret;
}
/**
@@ -591,6 +603,14 @@ int ldlm_pool_recalc(struct ldlm_pool *pl)
}
recalc_interval_sec = pl->pl_recalc_time - get_seconds() +
pl->pl_recalc_period;
+ if (recalc_interval_sec <= 0) {
+ /* Prevent too frequent recalculation. */
+ CDEBUG(D_DLMTRACE, "Negative interval(%ld), "
+ "too short period(%ld)",
+ recalc_interval_sec,
+ pl->pl_recalc_period);
+ recalc_interval_sec = 1;
+ }
return recalc_interval_sec;
}
--
2.1.0
From: Li Xi <[email protected]>
After a failure happens and put_pages_back() returns the pages,
tracefiled() should not keep iterating over the page list.
Otherwise, some pages might be accessed illegally.
Signed-off-by: Li Xi <[email protected]>
Signed-off-by: Jian Yu <[email protected]>
Reviewed-on: http://review.whamcloud.com/11454
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5126
Reviewed-by: Emoly Liu <[email protected]>
Reviewed-by: Niu Yawei <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/libcfs/tracefile.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/lustre/lustre/libcfs/tracefile.c b/drivers/staging/lustre/lustre/libcfs/tracefile.c
index d18de04..eb65b50 100644
--- a/drivers/staging/lustre/lustre/libcfs/tracefile.c
+++ b/drivers/staging/lustre/lustre/libcfs/tracefile.c
@@ -1037,6 +1037,7 @@ static int tracefiled(void *arg)
tage->used, rc);
put_pages_back(&pc);
__LASSERT(list_empty(&pc.pc_pages));
+ break;
}
}
MMSPACE_CLOSE;
--
2.1.0
From: Hongchao Zhang <[email protected]>
in "class_export_recovery_cleanup", the check of the flag
"exp->exp_req_replay_needed" should be protected by "exp_lock".
Signed-off-by: Hongchao Zhang <[email protected]>
Reviewed-on: http://review.whamcloud.com/10849
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5128
Reviewed-by: Fan Yong <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/obdclass/genops.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index 736ca41..8250821 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -1151,22 +1151,24 @@ void class_export_recovery_cleanup(struct obd_export *exp)
exp->exp_obd->obd_stale_clients++;
}
spin_unlock(&obd->obd_recovery_task_lock);
+
+ spin_lock(&exp->exp_lock);
/** Cleanup req replay fields */
if (exp->exp_req_replay_needed) {
- spin_lock(&exp->exp_lock);
exp->exp_req_replay_needed = 0;
- spin_unlock(&exp->exp_lock);
+
LASSERT(atomic_read(&obd->obd_req_replay_clients));
atomic_dec(&obd->obd_req_replay_clients);
}
+
/** Cleanup lock replay data */
if (exp->exp_lock_replay_needed) {
- spin_lock(&exp->exp_lock);
exp->exp_lock_replay_needed = 0;
- spin_unlock(&exp->exp_lock);
+
LASSERT(atomic_read(&obd->obd_lock_replay_clients));
atomic_dec(&obd->obd_lock_replay_clients);
}
+ spin_unlock(&exp->exp_lock);
}
/* This function removes 1-3 references from the export:
--
2.1.0
From: Dmitry Eremin <[email protected]>
Expression if (size != (ssize_t)size) is always false.
Therefore no bounds check errors detected.
Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/11376
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5417
Reviewed-by: John L. Hammond <[email protected]>
Reviewed-by: Fan Yong <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/include/lustre/lustre_user.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 89794fd..176bd46 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -1012,7 +1012,7 @@ static inline ssize_t hur_len(struct hsm_user_request *hur)
(__u64)hur->hur_request.hr_itemcount *
sizeof(hur->hur_user_item[0]) + hur->hur_request.hr_data_len;
- if (size != (ssize_t)size)
+ if (size > ~((size_t)0)>>1)
return -1;
return size;
--
2.1.0
From: Liang Zhen <[email protected]>
A couple of changes to improve aliveness detection:
- When LNet receives a message, it can determine that the peer of
this message is alive.
- When LNet receives a message from a remote network, it can determine
that the router is alive and that the NI status on the router is UP.
Signed-off-by: Liang Zhen <[email protected]>
Reviewed-on: http://review.whamcloud.com/12453
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5485
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/include/linux/lnet/lib-lnet.h | 10 ++++++++++
drivers/staging/lustre/lnet/lnet/lib-move.c | 13 +++++++++++++
drivers/staging/lustre/lnet/lnet/router.c | 17 ++++++++++++++++-
3 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
index 99fb52a..0038d29 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-lnet.h
@@ -636,6 +636,7 @@ lnet_net2rnethash(__u32 net)
}
extern lnd_t the_lolnd;
+extern int avoid_asym_router_failure;
int lnet_cpt_of_nid_locked(lnet_nid_t nid);
int lnet_cpt_of_nid(lnet_nid_t nid);
@@ -851,6 +852,7 @@ int lnet_peer_buffer_credits(lnet_ni_t *ni);
int lnet_router_checker_start(void);
void lnet_router_checker_stop(void);
+void lnet_router_ni_update_locked(lnet_peer_t *gw, __u32 net);
void lnet_swap_pinginfo(lnet_ping_info_t *info);
int lnet_ping_target_init(void);
@@ -870,4 +872,12 @@ void lnet_peer_tables_destroy(void);
int lnet_peer_tables_create(void);
void lnet_debug_peer(lnet_nid_t nid);
+static inline void lnet_peer_set_alive(lnet_peer_t *lp)
+{
+ lp->lp_last_alive = lp->lp_last_query = get_seconds();
+ if (!lp->lp_alive)
+ lnet_notify_locked(lp, 0, 1, lp->lp_last_alive);
+}
+
+
#endif
diff --git a/drivers/staging/lustre/lnet/lnet/lib-move.c b/drivers/staging/lustre/lnet/lnet/lib-move.c
index ed6eec9..0f53c76 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-move.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-move.c
@@ -1877,6 +1877,19 @@ lnet_parse(lnet_ni_t *ni, lnet_hdr_t *hdr, lnet_nid_t from_nid,
goto drop;
}
+ if (lnet_isrouter(msg->msg_rxpeer)) {
+ lnet_peer_set_alive(msg->msg_rxpeer);
+ if (avoid_asym_router_failure &&
+ LNET_NIDNET(src_nid) != LNET_NIDNET(from_nid)) {
+ /* received a remote message from router, update
+ * remote NI status on this router.
+ * NB: multi-hop routed message will be ignored.
+ */
+ lnet_router_ni_update_locked(msg->msg_rxpeer,
+ LNET_NIDNET(src_nid));
+ }
+ }
+
lnet_msg_commit(msg, cpt);
if (!for_me) {
diff --git a/drivers/staging/lustre/lnet/lnet/router.c b/drivers/staging/lustre/lnet/lnet/router.c
index 1bbaa5b..52ec0ab 100644
--- a/drivers/staging/lustre/lnet/lnet/router.c
+++ b/drivers/staging/lustre/lnet/lnet/router.c
@@ -84,7 +84,7 @@ static int check_routers_before_use;
module_param(check_routers_before_use, int, 0444);
MODULE_PARM_DESC(check_routers_before_use, "Assume routers are down and ping them before use");
-static int avoid_asym_router_failure = 1;
+int avoid_asym_router_failure = 1;
module_param(avoid_asym_router_failure, int, 0644);
MODULE_PARM_DESC(avoid_asym_router_failure, "Avoid asymmetrical router failures (0 to disable)");
@@ -783,6 +783,21 @@ lnet_wait_known_routerstate(void)
}
}
+void
+lnet_router_ni_update_locked(lnet_peer_t *gw, __u32 net)
+{
+ lnet_route_t *rte;
+
+ if ((gw->lp_ping_feats & LNET_PING_FEAT_NI_STATUS) != 0) {
+ list_for_each_entry(rte, &gw->lp_routes, lr_gwlist) {
+ if (rte->lr_net == net) {
+ rte->lr_downis = 0;
+ break;
+ }
+ }
+ }
+}
+
static void
lnet_update_ni_status_locked(void)
{
--
2.1.0
From: Jinshan Xiong <[email protected]>
If an MGS conf_param exists to reduce the memory cache
max_cached_mb, it will fail because dt_exp is not initialized
yet.
It should just go ahead and configure it, because there are
certainly enough free LRU slots to deduct from ccc_lru_left.
Signed-off-by: Jinshan Xiong <[email protected]>
Reviewed-on: http://review.whamcloud.com/11783
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3676
Reviewed-by: Bobi Jam <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/llite/lproc_llite.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index e6a909e..aaa13bd 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -399,9 +399,6 @@ static ssize_t ll_max_cached_mb_seq_write(struct file *file,
return -ERANGE;
}
- if (sbi->ll_dt_exp == NULL)
- return -ENODEV;
-
spin_lock(&sbi->ll_lock);
diff = pages_number - cache->ccc_lru_max;
spin_unlock(&sbi->ll_lock);
@@ -437,6 +434,11 @@ static ssize_t ll_max_cached_mb_seq_write(struct file *file,
if (diff <= 0)
break;
+ if (sbi->ll_dt_exp == NULL) { /* being initialized */
+ rc = -ENODEV;
+ break;
+ }
+
/* difficult - have to ask OSCs to drop LRU slots. */
tmp = diff << 1;
rc = obd_set_info_async(NULL, sbi->ll_dt_exp,
--
2.1.0
From: "John L. Hammond" <[email protected]>
In init_lustre_lite() ensure that /proc/fs/lustre/llite is removed in
case of failure. Generally rework the cleanup code in this function.
Signed-off-by: John L. Hammond <[email protected]>
Reviewed-on: http://review.whamcloud.com/6420
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3331
Reviewed-by: Bob Glossman <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/llite/super25.c | 141 +++++++++++++-------------
1 file changed, 72 insertions(+), 69 deletions(-)
diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
index 6aff155..7c1e02a 100644
--- a/drivers/staging/lustre/lustre/llite/super25.c
+++ b/drivers/staging/lustre/lustre/llite/super25.c
@@ -72,21 +72,6 @@ static void ll_destroy_inode(struct inode *inode)
call_rcu(&inode->i_rcu, ll_inode_destroy_callback);
}
-static int ll_init_inodecache(void)
-{
- ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
- sizeof(struct ll_inode_info),
- 0, SLAB_HWCACHE_ALIGN, NULL);
- if (ll_inode_cachep == NULL)
- return -ENOMEM;
- return 0;
-}
-
-static void ll_destroy_inodecache(void)
-{
- kmem_cache_destroy(ll_inode_cachep);
-}
-
/* exported operations */
struct super_operations lustre_super_operations = {
.alloc_inode = ll_alloc_inode,
@@ -104,9 +89,10 @@ void lustre_register_client_process_config(int (*cpc)(struct lustre_cfg *lcfg));
static int __init init_lustre_lite(void)
{
- int i, rc, seed[2];
- struct timeval tv;
+ struct proc_dir_entry *entry;
lnet_process_id_t lnet_id;
+ struct timeval tv;
+ int i, rc, seed[2];
CLASSERT(sizeof(LUSTRE_VOLATILE_HDR) == LUSTRE_VOLATILE_HDR_LEN + 1);
@@ -116,59 +102,52 @@ static int __init init_lustre_lite(void)
CDEBUG(D_INFO, "Lustre client module (%p).\n",
&lustre_super_operations);
- rc = ll_init_inodecache();
- if (rc)
- return -ENOMEM;
+ rc = -ENOMEM;
+ ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
+ sizeof(struct ll_inode_info),
+ 0, SLAB_HWCACHE_ALIGN, NULL);
+ if (ll_inode_cachep == NULL)
+ goto out_cache;
+
ll_file_data_slab = kmem_cache_create("ll_file_data",
sizeof(struct ll_file_data), 0,
SLAB_HWCACHE_ALIGN, NULL);
- if (ll_file_data_slab == NULL) {
- ll_destroy_inodecache();
- return -ENOMEM;
- }
+ if (ll_file_data_slab == NULL)
+ goto out_cache;
ll_remote_perm_cachep = kmem_cache_create("ll_remote_perm_cache",
sizeof(struct ll_remote_perm),
0, 0, NULL);
- if (ll_remote_perm_cachep == NULL) {
- kmem_cache_destroy(ll_file_data_slab);
- ll_file_data_slab = NULL;
- ll_destroy_inodecache();
- return -ENOMEM;
- }
+ if (ll_remote_perm_cachep == NULL)
+ goto out_cache;
ll_rmtperm_hash_cachep = kmem_cache_create("ll_rmtperm_hash_cache",
REMOTE_PERM_HASHSIZE *
sizeof(struct list_head),
0, 0, NULL);
- if (ll_rmtperm_hash_cachep == NULL) {
- kmem_cache_destroy(ll_remote_perm_cachep);
- ll_remote_perm_cachep = NULL;
- kmem_cache_destroy(ll_file_data_slab);
- ll_file_data_slab = NULL;
- ll_destroy_inodecache();
- return -ENOMEM;
+ if (ll_rmtperm_hash_cachep == NULL)
+ goto out_cache;
+
+ entry = lprocfs_register("llite", proc_lustre_root, NULL, NULL);
+ if (IS_ERR(entry)) {
+ rc = PTR_ERR(entry);
+ CERROR("cannot register '/proc/fs/lustre/llite': rc = %d\n",
+ rc);
+ goto out_cache;
}
- proc_lustre_fs_root = proc_lustre_root ?
- lprocfs_register("llite", proc_lustre_root, NULL, NULL) : NULL;
-
- lustre_register_client_fill_super(ll_fill_super);
- lustre_register_kill_super_cb(ll_kill_super);
-
- lustre_register_client_process_config(ll_process_config);
+ proc_lustre_fs_root = entry;
cfs_get_random_bytes(seed, sizeof(seed));
- /* Nodes with small feet have little entropy
- * the NID for this node gives the most entropy in the low bits */
- for (i = 0; ; i++) {
- if (LNetGetId(i, &lnet_id) == -ENOENT) {
+ /* Nodes with small feet have little entropy. The NID for this
+ * node gives the most entropy in the low bits */
+ for (i = 0;; i++) {
+ if (LNetGetId(i, &lnet_id) == -ENOENT)
break;
- }
- if (LNET_NETTYP(LNET_NIDNET(lnet_id.nid)) != LOLND) {
+
+ if (LNET_NETTYP(LNET_NIDNET(lnet_id.nid)) != LOLND)
seed[0] ^= LNET_NIDADDR(lnet_id.nid);
- }
}
do_gettimeofday(&tv);
@@ -177,20 +156,54 @@ static int __init init_lustre_lite(void)
init_timer(&ll_capa_timer);
ll_capa_timer.function = ll_capa_timer_callback;
rc = ll_capa_thread_start();
- /*
- * XXX normal cleanup is needed here.
- */
- if (rc == 0)
- rc = vvp_global_init();
+ if (rc != 0)
+ goto out_proc;
- if (rc == 0)
- rc = ll_xattr_init();
+ rc = vvp_global_init();
+ if (rc != 0)
+ goto out_capa;
+
+ rc = ll_xattr_init();
+ if (rc != 0)
+ goto out_vvp;
+
+ lustre_register_client_fill_super(ll_fill_super);
+ lustre_register_kill_super_cb(ll_kill_super);
+ lustre_register_client_process_config(ll_process_config);
+
+ return 0;
+
+out_vvp:
+ vvp_global_fini();
+out_capa:
+ del_timer(&ll_capa_timer);
+ ll_capa_thread_stop();
+out_proc:
+ lprocfs_remove(&proc_lustre_fs_root);
+out_cache:
+ if (ll_inode_cachep != NULL)
+ kmem_cache_destroy(ll_inode_cachep);
+
+ if (ll_file_data_slab != NULL)
+ kmem_cache_destroy(ll_file_data_slab);
+
+ if (ll_remote_perm_cachep != NULL)
+ kmem_cache_destroy(ll_remote_perm_cachep);
+
+ if (ll_rmtperm_hash_cachep != NULL)
+ kmem_cache_destroy(ll_rmtperm_hash_cachep);
return rc;
}
static void __exit exit_lustre_lite(void)
{
+ lustre_register_client_fill_super(NULL);
+ lustre_register_kill_super_cb(NULL);
+ lustre_register_client_process_config(NULL);
+
+ lprocfs_remove(&proc_lustre_fs_root);
+
ll_xattr_fini();
vvp_global_fini();
del_timer(&ll_capa_timer);
@@ -199,22 +212,12 @@ static void __exit exit_lustre_lite(void)
"client remaining capa count %d\n",
capa_count[CAPA_SITE_CLIENT]);
- lustre_register_client_fill_super(NULL);
- lustre_register_kill_super_cb(NULL);
-
- lustre_register_client_process_config(NULL);
-
- ll_destroy_inodecache();
-
+ kmem_cache_destroy(ll_inode_cachep);
kmem_cache_destroy(ll_rmtperm_hash_cachep);
- ll_rmtperm_hash_cachep = NULL;
kmem_cache_destroy(ll_remote_perm_cachep);
- ll_remote_perm_cachep = NULL;
kmem_cache_destroy(ll_file_data_slab);
- if (proc_lustre_fs_root && !IS_ERR(proc_lustre_fs_root))
- lprocfs_remove(&proc_lustre_fs_root);
}
MODULE_AUTHOR("Sun Microsystems, Inc. <http://www.lustre.org/>");
--
2.1.0
From: Henri Doreau <[email protected]>
A variable-length structure precedes llog_rec_tail within an llog
block. Thus cr_tail shouldn't be accessed directly as a structure
member; its actual location should be computed dynamically.
Signed-off-by: Henri Doreau <[email protected]>
Reviewed-on: http://review.whamcloud.com/11937
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5631
Reviewed-by: jacques-Charles Lafoucriere <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/obdclass/llog_swab.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/obdclass/llog_swab.c b/drivers/staging/lustre/lustre/obdclass/llog_swab.c
index d3ec90e..a2d5aa1 100644
--- a/drivers/staging/lustre/lustre/obdclass/llog_swab.c
+++ b/drivers/staging/lustre/lustre/obdclass/llog_swab.c
@@ -168,7 +168,8 @@ void lustre_swab_llog_rec(struct llog_rec_hdr *rec)
}
case CHANGELOG_REC:
{
- struct llog_changelog_rec *cr = (struct llog_changelog_rec *)rec;
+ struct llog_changelog_rec *cr =
+ (struct llog_changelog_rec *)rec;
__swab16s(&cr->cr.cr_namelen);
__swab16s(&cr->cr.cr_flags);
@@ -188,6 +189,8 @@ void lustre_swab_llog_rec(struct llog_rec_hdr *rec)
} else {
tail = &cr->cr_tail;
}
+ tail = (struct llog_rec_tail *)((char *)tail +
+ cr->cr.cr_namelen);
break;
}
case CHANGELOG_USER_REC:
--
2.1.0
From: Liang Zhen <[email protected]>
The portal spreading rotor should be unsigned; otherwise LNet may get
a negative CPT number and access invalid addresses.
Signed-off-by: Liang Zhen <[email protected]>
Reviewed-on: http://review.whamcloud.com/11936
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5639
Reviewed-by: Amir Shehata <[email protected]>
Reviewed-by: Isaac Huang <[email protected]>
Reviewed-by: Doug Oucharek <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/include/linux/lnet/lib-types.h | 2 +-
drivers/staging/lustre/lnet/lnet/lib-ptl.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/lustre/include/linux/lnet/lib-types.h b/drivers/staging/lustre/include/linux/lnet/lib-types.h
index ba1876f..50537668 100644
--- a/drivers/staging/lustre/include/linux/lnet/lib-types.h
+++ b/drivers/staging/lustre/include/linux/lnet/lib-types.h
@@ -622,7 +622,7 @@ typedef struct lnet_portal {
/* Match table for each CPT */
struct lnet_match_table **ptl_mtables;
/* spread rotor of incoming "PUT" */
- int ptl_rotor;
+ unsigned int ptl_rotor;
/* # active entries for this portal */
int ptl_mt_nmaps;
/* array of active entries' cpu-partition-id */
diff --git a/drivers/staging/lustre/lnet/lnet/lib-ptl.c b/drivers/staging/lustre/lnet/lnet/lib-ptl.c
index 19ed696..3ba0da9 100644
--- a/drivers/staging/lustre/lnet/lnet/lib-ptl.c
+++ b/drivers/staging/lustre/lnet/lnet/lib-ptl.c
@@ -262,10 +262,10 @@ lnet_mt_of_match(struct lnet_match_info *info, struct lnet_msg *msg)
{
struct lnet_match_table *mtable;
struct lnet_portal *ptl;
- int nmaps;
- int rotor;
- int routed;
- int cpt;
+ unsigned int nmaps;
+ unsigned int rotor;
+ unsigned int cpt;
+ bool routed;
/* NB: called w/o lock */
LASSERT(info->mi_portal < the_lnet.ln_nportals);
--
2.1.0
From: Dmitry Eremin <[email protected]>
Change type of cl_conn_count to size_t.
Signed-off-by: Dmitry Eremin <[email protected]>
Reviewed-on: http://review.whamcloud.com/13125
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5577
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: John L. Hammond <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/include/obd.h | 2 +-
drivers/staging/lustre/lustre/ldlm/ldlm_lib.c | 6 +++---
drivers/staging/lustre/lustre/mgc/mgc_request.c | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 2a88b80..7cbc378 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -312,7 +312,7 @@ struct client_obd {
struct rw_semaphore cl_sem;
struct obd_uuid cl_target_uuid;
struct obd_import *cl_import; /* ptlrpc connection state */
- int cl_conn_count;
+ size_t cl_conn_count;
/* max_mds_easize is purely a performance thing so we don't have to
* call obd_size_diskmd() all the time. */
int cl_default_mds_easize;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index c5c86e7..e1586ef 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -571,10 +571,10 @@ int client_disconnect_export(struct obd_export *exp)
imp = cli->cl_import;
down_write(&cli->cl_sem);
- CDEBUG(D_INFO, "disconnect %s - %d\n", obd->obd_name,
+ CDEBUG(D_INFO, "disconnect %s - %zu\n", obd->obd_name,
cli->cl_conn_count);
- if (!cli->cl_conn_count) {
+ if (cli->cl_conn_count == 0) {
CERROR("disconnecting disconnected device (%s)\n",
obd->obd_name);
rc = -EINVAL;
@@ -582,7 +582,7 @@ int client_disconnect_export(struct obd_export *exp)
}
cli->cl_conn_count--;
- if (cli->cl_conn_count) {
+ if (cli->cl_conn_count != 0) {
rc = 0;
goto out_disconnect;
}
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 60d2b0f..5124112 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -790,8 +790,8 @@ static int mgc_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
}
/* Make sure not to re-enqueue when the mgc is stopping
(we get called from client_disconnect_export) */
- if (!lock->l_conn_export ||
- !lock->l_conn_export->exp_obd->u.cli.cl_conn_count) {
+ if (lock->l_conn_export == NULL ||
+ lock->l_conn_export->exp_obd->u.cli.cl_conn_count == 0) {
CDEBUG(D_MGC, "log %.8s: disconnecting, won't requeue\n",
cld->cld_logname);
config_log_put(cld);
--
2.1.0
From: Niu Yawei <[email protected]>
In after_reply(), take rq_lock when changing rq_resend.
Signed-off-by: Niu Yawei <[email protected]>
Reviewed-on: http://review.whamcloud.com/11957
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5633
Reviewed-by: Fan Yong <[email protected]>
Reviewed-by: Johann Lombardi <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/ptlrpc/client.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 8c1ec83..4882dd0 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1247,7 +1247,9 @@ static int after_reply(struct ptlrpc_request *req)
time_t now = get_seconds();
DEBUG_REQ(D_RPCTRACE, req, "Resending request on EINPROGRESS");
+ spin_lock(&req->rq_lock);
req->rq_resend = 1;
+ spin_unlock(&req->rq_lock);
req->rq_nr_resend++;
/* allocate new xid to avoid reply reconstruction */
--
2.1.0
From: Jinshan Xiong <[email protected]>
vvp_io_read_lock() used to decide whether to add a read lock by
checking lli_has_smd. Accessing lli_has_smd is racy when an empty
file is turned into raid0; as a result, read requests may be issued
without the corresponding lock.
Signed-off-by: Jinshan Xiong <[email protected]>
Reviewed-on: http://review.whamcloud.com/12139
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5062
Reviewed-by: Bobi Jam <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/include/lclient.h | 1 +
drivers/staging/lustre/lustre/lclient/lcommon_cl.c | 6 ++++++
drivers/staging/lustre/lustre/llite/vvp_io.c | 15 +++++----------
drivers/staging/lustre/lustre/llite/vvp_lock.c | 1 +
4 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/staging/lustre/lustre/include/lclient.h b/drivers/staging/lustre/lustre/include/lclient.h
index 316500c..c5c3a8d 100644
--- a/drivers/staging/lustre/lustre/include/lclient.h
+++ b/drivers/staging/lustre/lustre/include/lclient.h
@@ -325,6 +325,7 @@ void ccc_lock_fini(const struct lu_env *env, struct cl_lock_slice *slice);
int ccc_lock_enqueue(const struct lu_env *env,
const struct cl_lock_slice *slice,
struct cl_io *io, __u32 enqflags);
+int ccc_lock_use(const struct lu_env *env, const struct cl_lock_slice *slice);
int ccc_lock_unuse(const struct lu_env *env, const struct cl_lock_slice *slice);
int ccc_lock_wait(const struct lu_env *env, const struct cl_lock_slice *slice);
int ccc_lock_fits_into(const struct lu_env *env,
diff --git a/drivers/staging/lustre/lustre/lclient/lcommon_cl.c b/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
index 24d26ab..23095bb 100644
--- a/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/lclient/lcommon_cl.c
@@ -586,6 +586,12 @@ int ccc_lock_enqueue(const struct lu_env *env,
return 0;
}
+int ccc_lock_use(const struct lu_env *env, const struct cl_lock_slice *slice)
+{
+ CLOBINVRNT(env, slice->cls_obj, ccc_object_invariant(slice->cls_obj));
+ return 0;
+}
+
int ccc_lock_unuse(const struct lu_env *env, const struct cl_lock_slice *slice)
{
CLOBINVRNT(env, slice->cls_obj, ccc_object_invariant(slice->cls_obj));
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 930f601..ce54b76 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -307,18 +307,13 @@ static int vvp_io_rw_lock(const struct lu_env *env, struct cl_io *io,
static int vvp_io_read_lock(const struct lu_env *env,
const struct cl_io_slice *ios)
{
- struct cl_io *io = ios->cis_io;
- struct ll_inode_info *lli = ll_i2info(ccc_object_inode(io->ci_obj));
+ struct cl_io *io = ios->cis_io;
+ struct cl_io_rw_common *rd = &io->u.ci_rd.rd;
int result;
- /* XXX: Layer violation, we shouldn't see lsm at llite level. */
- if (lli->lli_has_smd) /* lsm-less file doesn't need to lock */
- result = vvp_io_rw_lock(env, io, CLM_READ,
- io->u.ci_rd.rd.crw_pos,
- io->u.ci_rd.rd.crw_pos +
- io->u.ci_rd.rd.crw_count - 1);
- else
- result = 0;
+ result = vvp_io_rw_lock(env, io, CLM_READ, rd->crw_pos,
+ rd->crw_pos + rd->crw_count - 1);
+
return result;
}
diff --git a/drivers/staging/lustre/lustre/llite/vvp_lock.c b/drivers/staging/lustre/lustre/llite/vvp_lock.c
index 372633e..f354e82 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_lock.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_lock.c
@@ -71,6 +71,7 @@ static const struct cl_lock_operations vvp_lock_ops = {
.clo_fini = ccc_lock_fini,
.clo_enqueue = ccc_lock_enqueue,
.clo_wait = ccc_lock_wait,
+ .clo_use = ccc_lock_use,
.clo_unuse = ccc_lock_unuse,
.clo_fits_into = ccc_lock_fits_into,
.clo_state = ccc_lock_state,
--
2.1.0
From: wang di <[email protected]>
It is possible that the fld client tries to look up a seq on one of
the MDTs while the connection between the client and that MDT has not
been initialized yet. This happens especially during striped dir
creation, because the client only sends the create request to the
master MDT, and the master MDT then distributes the operation to all
of the other MDTs, instead of the client distributing these requests
itself (which would usually trigger the connection).
In this case, we will send the fld request to MDT0, since it has all
of the location information.
Signed-off-by: wang di <[email protected]>
Reviewed-on: http://review.whamcloud.com/11780
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4855
Reviewed-by: Fan Yong <[email protected]>
Reviewed-by: Mike Pershin <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/fld/fld_request.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/staging/lustre/lustre/fld/fld_request.c b/drivers/staging/lustre/lustre/fld/fld_request.c
index 7801db0..b8d17e1 100644
--- a/drivers/staging/lustre/lustre/fld/fld_request.c
+++ b/drivers/staging/lustre/lustre/fld/fld_request.c
@@ -131,11 +131,20 @@ fld_rrb_scan(struct lu_client_fld *fld, u64 seq)
else
hash = 0;
+again:
list_for_each_entry(target, &fld->lcf_targets, ft_chain) {
if (target->ft_idx == hash)
return target;
}
+ if (hash != 0) {
+ /* It is possible the remote target(MDT) are not connected to
+ * with client yet, so we will refer this to MDT0, which should
+ * be connected during mount */
+ hash = 0;
+ goto again;
+ }
+
CERROR("%s: Can't find target by hash %d (seq %#llx). Targets (%d):\n",
fld->lcf_name, hash, seq, fld->lcf_count);
--
2.1.0
From: Frank Zago <[email protected]>
Since reg->kr_fp can be changed inside the foreach loop,
kkuc_groups must be write-protected, not just read-protected.
This should fix the following oops, which could happen if two different
threads execute the function simultaneously and EPIPE is returned.
PID: 24385 TASK: ffff88012da5f500 CPU: 1 COMMAND: "ldlm_cb00_056"
#0 [ffff88012db55810] machine_kexec at ffffffff81038f3b
#1 [ffff88012db55870] crash_kexec at ffffffff810c59f2
#2 [ffff88012db55940] oops_end at ffffffff8152b7f0
#3 [ffff88012db55970] no_context at ffffffff8104a00b
#4 [ffff88012db559c0] __bad_area_nosemaphore at ffffffff8104a295
#5 [ffff88012db55a10] bad_area_nosemaphore at ffffffff8104a363
#6 [ffff88012db55a20] __do_page_fault at ffffffff8104aabf
#7 [ffff88012db55b40] do_page_fault at ffffffff8152d73e
#8 [ffff88012db55b70] page_fault at ffffffff8152aaf5
[exception RIP: fput+9]
RIP: ffffffff8118a509 RSP: ffff88012db55c20 RFLAGS: 00010246
RAX: 00000000ffffffe0 RBX: ffff8800a8ea4fc0 RCX: 0000000000000000
RDX: ffffffffa03c9eb0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88012db55c20 R8: 00000000ffffff0a R9: 00000000fffffffc
R10: 0000000000000001 R11: 282064656c696166 R12: ffffffffa03c9c60
R13: ffff88005df240f8 R14: 0000000000000000 R15: ffff88013b4ca000
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#9 [ffff88012db55c28] libcfs_kkuc_group_put at ffffffffa0388044 [libcfs]
[ptlrpc]
Signed-off-by: frank zago <[email protected]>
Reviewed-on: http://review.whamcloud.com/11355
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5458
Reviewed-by: Patrick Farrell <[email protected]>
Reviewed-by: James Simmons <[email protected]>
Reviewed-by: Dmitry Eremin <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/libcfs/kernel_user_comm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/lustre/lustre/libcfs/kernel_user_comm.c b/drivers/staging/lustre/lustre/libcfs/kernel_user_comm.c
index e2aa637..d9b7c6b 100644
--- a/drivers/staging/lustre/lustre/libcfs/kernel_user_comm.c
+++ b/drivers/staging/lustre/lustre/libcfs/kernel_user_comm.c
@@ -228,12 +228,12 @@ int libcfs_kkuc_group_foreach(int group, libcfs_kkuc_cb_t cb_func,
if (kkuc_groups[group].next == NULL)
return 0;
- down_read(&kg_sem);
+ down_write(&kg_sem);
list_for_each_entry(reg, &kkuc_groups[group], kr_chain) {
if (reg->kr_fp != NULL)
rc = cb_func(reg->kr_data, cb_arg);
}
- up_read(&kg_sem);
+ up_write(&kg_sem);
return rc;
}
--
2.1.0
From: Yang Sheng <[email protected]>
We need to check for an exception entry after radix_tree lookup.
Signed-off-by: Yang Sheng <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Reviewed-on: http://review.whamcloud.com/10709
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5162
Reviewed-by: Bob Glossman <[email protected]>
Reviewed-by: James Simmons <[email protected]>
---
drivers/staging/lustre/lustre/llite/dir.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index babba60..69a4a63 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -278,7 +278,7 @@ static struct page *ll_dir_page_locate(struct inode *dir, __u64 *hash,
spin_lock_irq(&mapping->tree_lock);
found = radix_tree_gang_lookup(&mapping->page_tree,
(void **)&page, offset, 1);
- if (found > 0) {
+ if (found > 0 && !radix_tree_exceptional_entry(page)) {
struct lu_dirpage *dp;
page_cache_get(page);
--
2.1.0
From: Lai Siyao <[email protected]>
Reading directory pages may fail on the MDS; in this case the client
should not cache a non-up-to-date directory page, because that will
cause a later read of the same page to fail.
Signed-off-by: Lai Siyao <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5461
Reviewed-on: http://review.whamcloud.com/11450
---
drivers/staging/lustre/lustre/llite/dir.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 69a4a63..a182019 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -183,7 +183,10 @@ static int ll_dir_filler(void *_hash, struct page *page0)
op_data->op_offset = hash;
rc = md_readpage(exp, op_data, page_pool, &request);
ll_finish_md_op_data(op_data);
- if (rc == 0) {
+ if (rc < 0) {
+ /* page0 is special, which was added into page cache early */
+ delete_from_page_cache(page0);
+ } else if (rc == 0) {
body = req_capsule_server_get(&request->rq_pill, &RMF_MDT_BODY);
/* Checked by mdc_readpage() */
LASSERT(body != NULL);
--
2.1.0
From: Patrick Farrell <[email protected]>
When a group lock with GID=0 is released (put_grouplock is
called), an assertion in cl_put_grouplock is hit.
We should not allow group lock requests with GID=0; instead
we should return -EINVAL.
Signed-off-by: Patrick Farrell <[email protected]>
Reviewed-on: http://review.whamcloud.com/12459
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5817
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: frank zago <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
---
drivers/staging/lustre/lustre/llite/file.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index ca270f4..7c7ef7e 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1553,6 +1553,11 @@ ll_get_grouplock(struct inode *inode, struct file *file, unsigned long arg)
struct ccc_grouplock grouplock;
int rc;
+ if (arg == 0) {
+ CWARN("group id for group lock must not be 0\n");
+ return -EINVAL;
+ }
+
if (ll_file_nolock(file))
return -EOPNOTSUPP;
--
2.1.0
From: Oleg Drokin <[email protected]>
Commit ab909585b813 ("staging: lustre: Cleanup variable declarations
in mdc_enqueue()") broke Lustre flock handling by introducing access
to an uninitialized req variable, leading to a bizarre crash in a later
call to __req_capsule_offset() with an invalid pill value.
Set req to NULL just for this case, as in all other cases req is
explicitly initialized by the request packing call.
Signed-off-by: Oleg Drokin <[email protected]>
CC: Srikrishan Malik <[email protected]>
---
drivers/staging/lustre/lustre/mdc/mdc_locks.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 8c9b4f5..d1c224e 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -828,6 +828,7 @@ resend:
einfo->ei_type);
policy = (ldlm_policy_data_t *)lmm;
res_id.name[3] = LDLM_FLOCK;
+ req = NULL;
} else if (it->it_op & IT_OPEN) {
req = mdc_intent_open_pack(exp, it, op_data, lmm, lmmsize,
einfo->ei_cbdata);
--
2.1.0
On Sun, Feb 01, 2015 at 09:52:05PM -0500, [email protected] wrote:
> From: Dmitry Eremin <[email protected]>
>
> Expression if (size != (ssize_t)size) is always false.
> Therefore no bounds check errors detected.
The original code actually worked as designed. The integer overflow
could only happen on 32 bit systems and the test only was true for 32
bit systems.
> - if (size != (ssize_t)size)
> + if (size > ~((size_t)0)>>1)
> return -1;
The problem is that the code was unclear. I think the new code is even
more complicated to look at.
regards,
dan carpenter
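For illustration, a minimal user-space sketch of why the cast-based check
only ever fires where ssize_t is 32 bits wide. This is not Lustre code; the
4 GiB constant and the explicit int32_t/int64_t casts are stand-ins for
ssize_t on 32-bit and 64-bit builds.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t size = 0x100000000ULL;	/* 4 GiB: fits in a u64, not in a 32-bit ssize_t */

	/* 64-bit build: ssize_t is 64 bits, the round-trip cast is lossless,
	 * so "size != (ssize_t)size" stays false and never reports an error. */
	if (size != (uint64_t)(int64_t)size)
		printf("64-bit check fires (it does not for values below 2^63)\n");

	/* 32-bit build: ssize_t is 32 bits, the cast truncates 0x100000000 to 0,
	 * so the comparison is true and the bounds check fires as designed. */
	if (size != (uint64_t)(int32_t)size)
		printf("32-bit check fires: %llu does not fit in ssize_t\n",
		       (unsigned long long)size);

	return 0;
}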
On Mon, Feb 02, 2015 at 04:02:31PM +0300, Dan Carpenter wrote:
> On Sun, Feb 01, 2015 at 09:52:05PM -0500, [email protected] wrote:
> > From: Dmitry Eremin <[email protected]>
> >
> > Expression if (size != (ssize_t)size) is always false.
> > Therefore no bounds check errors detected.
>
> The original code actually worked as designed. The integer overflow
> could only happen on 32 bit systems and the test only was true for 32
> bit systems.
>
> > - if (size != (ssize_t)size)
> > + if (size > ~((size_t)0)>>1)
> > return -1;
>
> The problem is that the code was unclear. I think the new code is even
> more complicated to look at.
I agree, I don't even understand what the new code is doing.
What is this code supposed to be protecting from? And -1? That should
never be a return value...
thanks,
greg k-h
Hello!
On Feb 2, 2015, at 10:44 AM, Greg Kroah-Hartman wrote:
> On Mon, Feb 02, 2015 at 04:02:31PM +0300, Dan Carpenter wrote:
>> On Sun, Feb 01, 2015 at 09:52:05PM -0500, [email protected] wrote:
>>> From: Dmitry Eremin <[email protected]>
>>>
>>> Expression if (size != (ssize_t)size) is always false.
>>> Therefore no bounds check errors detected.
>>
>> The original code actually worked as designed. The integer overflow
>> could only happen on 32 bit systems and the test only was true for 32
>> bit systems.
Hm, indeed.
Originally I fell into the trap thinking we are trying to protect against
negative results here too. But in fact callers all check for the return
to be negative as an error sign. Not to mention that we cannot overflow
64bit integer here as explained by the comment just 2 lines above the
default patch context.
>>
>>> - if (size != (ssize_t)size)
>>> + if (size > ~((size_t)0)>>1)
>>> return -1;
>>
>> The problem is that the code was unclear. I think the new code is even
>> more complicated to look at.
> I agree, I don't even understand what the new code is doing.
Sorry, this patch indeed should be dropped.
> What is this code supposed to be protecting from? And -1? That should
> never be a return value...
Why is -1 a bad return value if all callsites check for that as an
indication of error?
(granted there's only one caller at this point in kernel space:
lustre/llite/dir.c::ll_dir_ioctl()
totalsize = hur_len(hur);
OBD_FREE_PTR(hur);
if (totalsize < 0)
return -E2BIG;
)
Bye,
Oleg-
On Mon, Feb 02, 2015 at 03:25:58PM -0500, Oleg Drokin wrote:
> Hello!
>
> On Feb 2, 2015, at 10:44 AM, Greg Kroah-Hartman wrote:
>
> > On Mon, Feb 02, 2015 at 04:02:31PM +0300, Dan Carpenter wrote:
> >> On Sun, Feb 01, 2015 at 09:52:05PM -0500, [email protected] wrote:
> >>> From: Dmitry Eremin <[email protected]>
> >>>
> >>> Expression if (size != (ssize_t)size) is always false.
> >>> Therefore no bounds check errors detected.
> >>
> >> The original code actually worked as designed. The integer overflow
> >> could only happen on 32 bit systems and the test only was true for 32
> >> bit systems.
>
> Hm, indeed.
> Originally I fell into the trap thinking we are trying to protect against
> negative results here too. But in fact callers all check for the return
> to be negative as an error sign. Not to mention that we cannot overflow
> 64bit integer here as explained by the comment just 2 lines above the
> default patch context.
>
> >>
> >>> - if (size != (ssize_t)size)
> >>> + if (size > ~((size_t)0)>>1)
> >>> return -1;
> >>
> >> The problem is that the code was unclear. I think the new code is even
> >> more complicated to look at.
> > I agree, I don't even understand what the new code is doing.
>
> Sorry, this patch indeed should be dropped.
>
> > What is this code supposed to be protecting from? And -1? That should
> > never be a return value...
>
> Why is -1 a bad return value if all callsites check for that as an
> indication of error?
Because you should use "real" error values, don't make them up with
random negative numbers that mean nothing.
> (granted there's only one caller at this point in kernel space:
> lustre/llite/dir.c::ll_dir_ioctl()
> totalsize = hur_len(hur);
> OBD_FREE_PTR(hur);
> if (totalsize < 0)
> return -E2BIG;
> )
Shouldn't you have returned the error that hur_len() passed you?
thanks,
greg k-h
On Feb 2, 2015, at 3:51 PM, Greg Kroah-Hartman wrote:
> On Mon, Feb 02, 2015 at 03:25:58PM -0500, Oleg Drokin wrote:
>>> What is this code supposed to be protecting from? And -1? That should
>>> never be a return value...
>>
>> Why is -1 a bad return value if all callsites check for that as an
>> indication of error?
>
> Because you should use "real" error values, don't make them up with
> random negative numbers that mean nothing.
>
>> (granted there's only one caller at this point in kernel space:
>> lustre/llite/dir.c::ll_dir_ioctl()
>> totalsize = hur_len(hur);
>> OBD_FREE_PTR(hur);
>> if (totalsize < 0)
>> return -E2BIG;
>> )
>
> Shouldn't you have returned the error that hur_len() passed you?
Ok, I guess that makes quite a bit of sense too.
I'll do an updated patch about that later on then.
Thanks!
Bye,
Oleg-
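For context, a sketch of what such an updated patch might look like; it
simply mirrors the hur_len() tail and the ll_dir_ioctl() fragment quoted
above and is not the actual follow-up change.

static inline ssize_t hur_len(struct hsm_user_request *hur)
{
	__u64 size;

	/* ... same size computation as before ... */

	if (size != (ssize_t)size)	/* does not fit in ssize_t */
		return -E2BIG;		/* real errno instead of a bare -1 */

	return size;
}

And in the only in-kernel caller, ll_dir_ioctl(), forward whatever error
hur_len() returned:

	totalsize = hur_len(hur);
	OBD_FREE_PTR(hur);
	if (totalsize < 0)
		return totalsize;	/* e.g. -E2BIG from hur_len() */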
On Sun, Feb 01, 2015 at 09:52:11PM -0500, [email protected] wrote:
> From: Dmitry Eremin <[email protected]>
>
> Change type of cl_conn_count to size_t.
Yes, that's what you did, but you don't say _why_ you did this.
Please fix up and resend.
thanks,
greg k-h