From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Kevin Tian, Zhenyu Wang, Weinan Li, Changbin Du
Subject: [PATCH 4.16 194/196] drm/i915/gvt: init mmio by lri command in vgpu inhibit context
Date: Sun, 22 Apr 2018 15:53:34 +0200
Message-Id: <20180422135114.320584004@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180422135104.278511750@linuxfoundation.org>
References: <20180422135104.278511750@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
X-Mailing-List: linux-kernel@vger.kernel.org

4.16-stable review patch.  If anyone has any objections, please let me know.
------------------

From: Weinan Li

commit cd7e61b93d068a80bfe6cb55bf00f17332d831a1 upstream.

There is one issue related to Coarse Power Gating (CPG) on KBL NUC in
GVT-g: the vGPU cannot get the correct default context by updating the
registers before inhibit context submission. It always gets back the
hardware default value unless the inhibit context submission happens
before the first forcewake put. With this wrong default context, the
vGPU runs with incorrect state and hits unknown issues.

The solution is to initialize these MMIOs by adding LRI commands to the
ring buffer of the inhibit context. The GPU hardware then has no chance
to enter RC6 while the LRI commands are being executed, so the vGPU can
get the correct default context for further use.

v3:
- fix code fault, use 'for' to loop through the mmio render list (Zhenyu)

v4:
- save the count of engine mmios that need to be restored for the
  inhibit context and refine some comments. (Kevin)

v5:
- code rebase

Cc: Kevin Tian
Cc: Zhenyu Wang
Signed-off-by: Weinan Li
Signed-off-by: Zhenyu Wang
Signed-off-by: Changbin Du
Signed-off-by: Greg Kroah-Hartman
---
 drivers/gpu/drm/i915/gvt/gvt.h          |    5
 drivers/gpu/drm/i915/gvt/mmio_context.c |  210 +++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/gvt/mmio_context.h |    5
 drivers/gpu/drm/i915/gvt/scheduler.c    |    5
 4 files changed, 205 insertions(+), 20 deletions(-)
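
[ Illustrative note, not part of the patch: the fix replaces CPU MMIO writes
  with MI_LOAD_REGISTER_IMM (LRI) packets emitted into the inhibit context's
  ring buffer, so the register writes are executed by the GPU itself and
  cannot be lost while the render engine is power-gated. A minimal
  stand-alone sketch of the LRI packet layout that
  restore_context_mmio_for_inhibit() builds is below; the MI_INSTR encodings
  follow the i915 definitions, and the register offsets/values are made-up
  examples, not taken from the patch.

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>

	#define MI_INSTR(opcode, flags)	(((opcode) << 23) | (flags))
	#define MI_NOOP			MI_INSTR(0, 0)
	/* LRI with n offset/value pairs: opcode 0x22, dword length 2*n - 1 */
	#define MI_LOAD_REGISTER_IMM(n)	MI_INSTR(0x22, 2*(n) - 1)

	int main(void)
	{
		/* example register offset/value pairs (illustrative only) */
		const uint32_t regs[][2] = { { 0x229c, 0xffff0800 },
					     { 0x2098, 0x00000000 } };
		const unsigned int count = 2;
		uint32_t cs[2 * 2 + 2], *p = cs;
		unsigned int i;

		*p++ = MI_LOAD_REGISTER_IMM(count);	/* packet header */
		for (i = 0; i < count; i++) {
			*p++ = regs[i][0];		/* register offset */
			*p++ = regs[i][1];		/* immediate value */
		}
		*p++ = MI_NOOP;				/* pad the packet */

		for (i = 0; i < sizeof(cs) / sizeof(cs[0]); i++)
			printf("0x%08" PRIx32 "\n", cs[i]);
		return 0;
	}
]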
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -308,7 +308,10 @@ struct intel_gvt {
 	wait_queue_head_t service_thread_wq;
 	unsigned long service_request;
 
-	struct engine_mmio *engine_mmio_list;
+	struct {
+		struct engine_mmio *mmio;
+		int ctx_mmio_count[I915_NUM_ENGINES];
+	} engine_mmio_list;
 
 	struct dentry *debugfs_root;
 };
--- a/drivers/gpu/drm/i915/gvt/mmio_context.c
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.c
@@ -50,6 +50,8 @@
 #define RING_GFX_MODE(base)	_MMIO((base) + 0x29c)
 #define VF_GUARDBAND		_MMIO(0x83a4)
 
+#define GEN9_MOCS_SIZE		64
+
 /* Raw offset is appened to each line for convenience. */
 static struct engine_mmio gen8_engine_mmio_list[] __cacheline_aligned = {
 	{RCS, GFX_MODE_GEN7, 0xffff, false}, /* 0x229c */
@@ -152,8 +154,8 @@ static struct engine_mmio gen9_engine_mm
 
 static struct {
 	bool initialized;
-	u32 control_table[I915_NUM_ENGINES][64];
-	u32 l3cc_table[32];
+	u32 control_table[I915_NUM_ENGINES][GEN9_MOCS_SIZE];
+	u32 l3cc_table[GEN9_MOCS_SIZE / 2];
 } gen9_render_mocs;
 
 static void load_render_mocs(struct drm_i915_private *dev_priv)
@@ -170,7 +172,7 @@ static void load_render_mocs(struct drm_
 
 	for (ring_id = 0; ring_id < ARRAY_SIZE(regs); ring_id++) {
 		offset.reg = regs[ring_id];
-		for (i = 0; i < 64; i++) {
+		for (i = 0; i < GEN9_MOCS_SIZE; i++) {
 			gen9_render_mocs.control_table[ring_id][i] =
 				I915_READ_FW(offset);
 			offset.reg += 4;
@@ -178,7 +180,7 @@ static void load_render_mocs(struct drm_
 	}
 
 	offset.reg = 0xb020;
-	for (i = 0; i < 32; i++) {
+	for (i = 0; i < GEN9_MOCS_SIZE / 2; i++) {
 		gen9_render_mocs.l3cc_table[i] =
 			I915_READ_FW(offset);
 		offset.reg += 4;
@@ -186,6 +188,153 @@ static void load_render_mocs(struct drm_
 	gen9_render_mocs.initialized = true;
 }
 
+static int
+restore_context_mmio_for_inhibit(struct intel_vgpu *vgpu,
+				 struct drm_i915_gem_request *req)
+{
+	u32 *cs;
+	int ret;
+	struct engine_mmio *mmio;
+	struct intel_gvt *gvt = vgpu->gvt;
+	int ring_id = req->engine->id;
+	int count = gvt->engine_mmio_list.ctx_mmio_count[ring_id];
+
+	if (count == 0)
+		return 0;
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	cs = intel_ring_begin(req, count * 2 + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(count);
+	for (mmio = gvt->engine_mmio_list.mmio;
+	     i915_mmio_reg_valid(mmio->reg); mmio++) {
+		if (mmio->ring_id != ring_id ||
+		    !mmio->in_context)
+			continue;
+
+		*cs++ = i915_mmio_reg_offset(mmio->reg);
+		*cs++ = vgpu_vreg_t(vgpu, mmio->reg) |
+				(mmio->mask << 16);
+		gvt_dbg_core("add lri reg pair 0x%x:0x%x in inhibit ctx, vgpu:%d, rind_id:%d\n",
+			      *(cs-2), *(cs-1), vgpu->id, ring_id);
+	}
+
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int
+restore_render_mocs_control_for_inhibit(struct intel_vgpu *vgpu,
+					struct drm_i915_gem_request *req)
+{
+	unsigned int index;
+	u32 *cs;
+
+	cs = intel_ring_begin(req, 2 * GEN9_MOCS_SIZE + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(GEN9_MOCS_SIZE);
+
+	for (index = 0; index < GEN9_MOCS_SIZE; index++) {
+		*cs++ = i915_mmio_reg_offset(GEN9_GFX_MOCS(index));
+		*cs++ = vgpu_vreg_t(vgpu, GEN9_GFX_MOCS(index));
+		gvt_dbg_core("add lri reg pair 0x%x:0x%x in inhibit ctx, vgpu:%d, rind_id:%d\n",
+			      *(cs-2), *(cs-1), vgpu->id, req->engine->id);
+
+	}
+
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	return 0;
+}
+
+static int
+restore_render_mocs_l3cc_for_inhibit(struct intel_vgpu *vgpu,
+				     struct drm_i915_gem_request *req)
+{
+	unsigned int index;
+	u32 *cs;
+
+	cs = intel_ring_begin(req, 2 * GEN9_MOCS_SIZE / 2 + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_LOAD_REGISTER_IMM(GEN9_MOCS_SIZE / 2);
+
+	for (index = 0; index < GEN9_MOCS_SIZE / 2; index++) {
+		*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(index));
+		*cs++ = vgpu_vreg_t(vgpu, GEN9_LNCFCMOCS(index));
+		gvt_dbg_core("add lri reg pair 0x%x:0x%x in inhibit ctx, vgpu:%d, rind_id:%d\n",
+			      *(cs-2), *(cs-1), vgpu->id, req->engine->id);
+
+	}
+
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	return 0;
+}
+
+/*
+ * Use lri command to initialize the mmio which is in context state image for
+ * inhibit context, it contains tracked engine mmio, render_mocs and
+ * render_mocs_l3cc.
+ */
+int intel_vgpu_restore_inhibit_context(struct intel_vgpu *vgpu,
+				       struct drm_i915_gem_request *req)
+{
+	int ret;
+	u32 *cs;
+
+	cs = intel_ring_begin(req, 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	ret = restore_context_mmio_for_inhibit(vgpu, req);
+	if (ret)
+		goto out;
+
+	/* no MOCS register in context except render engine */
+	if (req->engine->id != RCS)
+		goto out;
+
+	ret = restore_render_mocs_control_for_inhibit(vgpu, req);
+	if (ret)
+		goto out;
+
+	ret = restore_render_mocs_l3cc_for_inhibit(vgpu, req);
+	if (ret)
+		goto out;
+
+out:
+	cs = intel_ring_begin(req, 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	*cs++ = MI_NOOP;
+	intel_ring_advance(req, cs);
+
+	return ret;
+}
+
 static void handle_tlb_pending_event(struct intel_vgpu *vgpu, int ring_id)
 {
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
@@ -252,11 +401,14 @@ static void switch_mocs(struct intel_vgp
 	if (WARN_ON(ring_id >= ARRAY_SIZE(regs)))
 		return;
 
+	if (IS_KABYLAKE(dev_priv) && ring_id == RCS)
+		return;
+
 	if (!pre && !gen9_render_mocs.initialized)
 		load_render_mocs(dev_priv);
 
 	offset.reg = regs[ring_id];
-	for (i = 0; i < 64; i++) {
+	for (i = 0; i < GEN9_MOCS_SIZE; i++) {
 		if (pre)
 			old_v = vgpu_vreg_t(pre, offset);
 		else
@@ -274,7 +426,7 @@ static void switch_mocs(struct intel_vgp
 
 	if (ring_id == RCS) {
 		l3_offset.reg = 0xb020;
-		for (i = 0; i < 32; i++) {
+		for (i = 0; i < GEN9_MOCS_SIZE / 2; i++) {
 			if (pre)
 				old_v = vgpu_vreg_t(pre, l3_offset);
 			else
@@ -294,6 +446,16 @@ static void switch_mocs(struct intel_vgp
 
 #define CTX_CONTEXT_CONTROL_VAL	0x03
 
+bool is_inhibit_context(struct i915_gem_context *ctx, int ring_id)
+{
+	u32 *reg_state = ctx->engine[ring_id].lrc_reg_state;
+	u32 inhibit_mask =
+		_MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);
+
+	return inhibit_mask ==
+		(reg_state[CTX_CONTEXT_CONTROL_VAL] & inhibit_mask);
+}
+
 /* Switch ring mmio values (context). */
 static void switch_mmio(struct intel_vgpu *pre,
 			struct intel_vgpu *next,
@@ -301,9 +463,6 @@ static void switch_mmio(struct intel_vgp
 {
 	struct drm_i915_private *dev_priv;
 	struct intel_vgpu_submission *s;
-	u32 *reg_state, ctx_ctrl;
-	u32 inhibit_mask =
-		_MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);
 	struct engine_mmio *mmio;
 	u32 old_v, new_v;
 
@@ -311,10 +470,18 @@ static void switch_mmio(struct intel_vgp
 	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv))
 		switch_mocs(pre, next, ring_id);
 
-	for (mmio = dev_priv->gvt->engine_mmio_list;
+	for (mmio = dev_priv->gvt->engine_mmio_list.mmio;
 	     i915_mmio_reg_valid(mmio->reg); mmio++) {
 		if (mmio->ring_id != ring_id)
 			continue;
+		/*
+		 * No need to do save or restore of the mmio which is in context
+		 * state image on kabylake, it's initialized by lri command and
+		 * save or restore with context together.
+		 */
+		if (IS_KABYLAKE(dev_priv) && mmio->in_context)
+			continue;
+
 		// save
 		if (pre) {
 			vgpu_vreg_t(pre, mmio->reg) = I915_READ_FW(mmio->reg);
@@ -328,16 +495,13 @@ static void switch_mmio(struct intel_vgp
 		// restore
 		if (next) {
 			s = &next->submission;
-			reg_state =
-				s->shadow_ctx->engine[ring_id].lrc_reg_state;
-			ctx_ctrl = reg_state[CTX_CONTEXT_CONTROL_VAL];
 			/*
-			 * if it is an inhibit context, load in_context mmio
-			 * into HW by mmio write. If it is not, skip this mmio
-			 * write.
+			 * No need to restore the mmio which is in context state
+			 * image if it's not inhibit context, it will restore
+			 * itself.
 			 */
 			if (mmio->in_context &&
-			    (ctx_ctrl & inhibit_mask) != inhibit_mask)
+			    !is_inhibit_context(s->shadow_ctx, ring_id))
 				continue;
 
 			if (mmio->mask)
@@ -408,8 +572,16 @@ void intel_gvt_switch_mmio(struct intel_
  */
 void intel_gvt_init_engine_mmio_context(struct intel_gvt *gvt)
 {
+	struct engine_mmio *mmio;
+
 	if (IS_SKYLAKE(gvt->dev_priv) || IS_KABYLAKE(gvt->dev_priv))
-		gvt->engine_mmio_list = gen9_engine_mmio_list;
+		gvt->engine_mmio_list.mmio = gen9_engine_mmio_list;
 	else
-		gvt->engine_mmio_list = gen8_engine_mmio_list;
+		gvt->engine_mmio_list.mmio = gen8_engine_mmio_list;
+
+	for (mmio = gvt->engine_mmio_list.mmio;
+	     i915_mmio_reg_valid(mmio->reg); mmio++) {
+		if (mmio->in_context)
+			gvt->engine_mmio_list.ctx_mmio_count[mmio->ring_id]++;
+	}
 }
--- a/drivers/gpu/drm/i915/gvt/mmio_context.h
+++ b/drivers/gpu/drm/i915/gvt/mmio_context.h
@@ -49,4 +49,9 @@ void intel_gvt_switch_mmio(struct intel_
 
 void intel_gvt_init_engine_mmio_context(struct intel_gvt *gvt);
 
+bool is_inhibit_context(struct i915_gem_context *ctx, int ring_id);
+
+int intel_vgpu_restore_inhibit_context(struct intel_vgpu *vgpu,
+				       struct drm_i915_gem_request *req);
+
 #endif
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -275,6 +275,11 @@ static int copy_workload_to_ring_buffer(
 	struct intel_vgpu *vgpu = workload->vgpu;
 	void *shadow_ring_buffer_va;
 	u32 *cs;
+	struct drm_i915_gem_request *req = workload->req;
+
+	if (IS_KABYLAKE(req->i915) &&
+	    is_inhibit_context(req->ctx, req->engine->id))
+		intel_vgpu_restore_inhibit_context(vgpu, req);
 
 	/* allocate shadow ring buffer */
 	cs = intel_ring_begin(workload->req, workload->rb_len / sizeof(u32));