2023-10-11 11:02:36

by Winkler, Tomas

[permalink] [raw]
Subject: [char-misc-next 0/4] mei: enhance mei pxp recoverability

- Add timeouts to send/receive API on mei bus to avoid
infinite waits and consequent deadlock.
- Re-enable the mei pxp client upon loss.
- Retry receive operation in case of memory pressure related failure.

Alan Previn (1):
mei: update mei-pxp's component interface with timeouts

Alexander Usyskin (3):
mei: bus: add send and recv api with timeout
mei: pxp: recover from recv fail under memory pressure
mei: pxp: re-enable client on errors

drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 8 ++-
drivers/misc/mei/bus.c | 88 +++++++++++++++++++++++-
drivers/misc/mei/pxp/mei_pxp.c | 88 +++++++++++++++++++++---
include/drm/i915_pxp_tee_interface.h | 6 +-
include/linux/mei_cl_bus.h | 8 +++
5 files changed, 183 insertions(+), 15 deletions(-)

--
2.41.0


2023-10-11 11:02:37

by Winkler, Tomas

[permalink] [raw]
Subject: [char-misc-next 2/4] mei: pxp: recover from recv fail under memory pressure

From: Alexander Usyskin <[email protected]>

Under memory pressure recv fails due to kmalloc failure,
and if drivers(pxp) retry send/receive, send blocks
indefinitely.
Send without recv leaves the channel in a bad state.

Retry send attempt after small timeout and reset the channel if
the retry failed on kmalloc failure too.

Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Alan Previn <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/pxp/mei_pxp.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index 2dcb9169e404c3bcbbccc7b4..c6cdd6a47308ebcc72f34c38 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -11,6 +11,7 @@
* negotiation messages to ME FW command payloads and vice versa.
*/

+#include <linux/delay.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/mei.h>
@@ -61,16 +62,38 @@ mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
{
struct mei_cl_device *cldev;
ssize_t byte;
+ bool retry = false;

if (!dev || !buffer)
return -EINVAL;

cldev = to_mei_cl_device(dev);

+retry:
byte = mei_cldev_recv(cldev, buffer, size);
if (byte < 0) {
dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte);
- return byte;
+ if (byte != -ENOMEM)
+ return byte;
+
+ /* Retry the read when pages are reclaimed */
+ msleep(20);
+ if (!retry) {
+ retry = true;
+ goto retry;
+ } else {
+ dev_warn(dev, "No memory on data receive after retry, trying to reset the channel...\n");
+ byte = mei_cldev_disable(cldev);
+ if (byte < 0)
+ dev_warn(dev, "mei_cldev_disable failed. %zd\n", byte);
+ /*
+ * Explicitly ignoring disable failure,
+ * enable may fix the states and succeed
+ */
+ byte = mei_cldev_enable(cldev);
+ if (byte < 0)
+ dev_err(dev, "mei_cldev_enable failed. %zd\n", byte);
+ }
}

return byte;
--
2.41.0

2023-10-11 11:02:38

by Winkler, Tomas

[permalink] [raw]
Subject: [char-misc-next 1/4] mei: bus: add send and recv api with timeout

From: Alexander Usyskin <[email protected]>

Add variation of the send and recv functions on bus
that define timeout.
Caller can use such functions in flow that can stuck
to bail out and not to put down the whole system.

Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Alan Previn <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/bus.c | 88 +++++++++++++++++++++++++++++++++++++-
include/linux/mei_cl_bus.h | 8 ++++
2 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 67557c67bd214415b8dc6747..f9bcff197615128d72f17590 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -257,7 +257,7 @@ ssize_t __mei_cl_recv(struct mei_cl *cl, u8 *buf, size_t length, u8 *vtag,
}

/**
- * mei_cldev_send_vtag - me device send with vtag (write)
+ * mei_cldev_send_vtag - me device send with vtag (write)
*
* @cldev: me client device
* @buf: buffer to send
@@ -278,6 +278,29 @@ ssize_t mei_cldev_send_vtag(struct mei_cl_device *cldev, const u8 *buf,
}
EXPORT_SYMBOL_GPL(mei_cldev_send_vtag);

+/**
+ * mei_cldev_send_vtag_timeout - me device send with vtag and timeout (write)
+ *
+ * @cldev: me client device
+ * @buf: buffer to send
+ * @length: buffer length
+ * @vtag: virtual tag
+ * @timeout: send timeout in milliseconds, 0 for infinite timeout
+ *
+ * Return:
+ * * written size in bytes
+ * * < 0 on error
+ */
+
+ssize_t mei_cldev_send_vtag_timeout(struct mei_cl_device *cldev, const u8 *buf,
+ size_t length, u8 vtag, unsigned long timeout)
+{
+ struct mei_cl *cl = cldev->cl;
+
+ return __mei_cl_send_timeout(cl, buf, length, vtag, MEI_CL_IO_TX_BLOCKING, timeout);
+}
+EXPORT_SYMBOL_GPL(mei_cldev_send_vtag_timeout);
+
/**
* mei_cldev_recv_vtag - client receive with vtag (read)
*
@@ -323,7 +346,49 @@ ssize_t mei_cldev_recv_nonblock_vtag(struct mei_cl_device *cldev, u8 *buf,
EXPORT_SYMBOL_GPL(mei_cldev_recv_nonblock_vtag);

/**
- * mei_cldev_send - me device send (write)
+ * mei_cldev_recv_timeout - client receive with timeout (read)
+ *
+ * @cldev: me client device
+ * @buf: buffer to receive
+ * @length: buffer length
+ * @timeout: send timeout in milliseconds, 0 for infinite timeout
+ *
+ * Return:
+ * * read size in bytes
+ * * < 0 on error
+ */
+ssize_t mei_cldev_recv_timeout(struct mei_cl_device *cldev, u8 *buf, size_t length,
+ unsigned long timeout)
+{
+ return mei_cldev_recv_vtag_timeout(cldev, buf, length, NULL, timeout);
+}
+EXPORT_SYMBOL_GPL(mei_cldev_recv_timeout);
+
+/**
+ * mei_cldev_recv_vtag_timeout - client receive with vtag (read)
+ *
+ * @cldev: me client device
+ * @buf: buffer to receive
+ * @length: buffer length
+ * @vtag: virtual tag
+ * @timeout: recv timeout in milliseconds, 0 for infinite timeout
+ *
+ * Return:
+ * * read size in bytes
+ * * < 0 on error
+ */
+
+ssize_t mei_cldev_recv_vtag_timeout(struct mei_cl_device *cldev, u8 *buf, size_t length,
+ u8 *vtag, unsigned long timeout)
+{
+ struct mei_cl *cl = cldev->cl;
+
+ return __mei_cl_recv(cl, buf, length, vtag, 0, timeout);
+}
+EXPORT_SYMBOL_GPL(mei_cldev_recv_vtag_timeout);
+
+/**
+ * mei_cldev_send - me device send (write)
*
* @cldev: me client device
* @buf: buffer to send
@@ -339,6 +404,25 @@ ssize_t mei_cldev_send(struct mei_cl_device *cldev, const u8 *buf, size_t length
}
EXPORT_SYMBOL_GPL(mei_cldev_send);

+/**
+ * mei_cldev_send_timeout - me device send with timeout (write)
+ *
+ * @cldev: me client device
+ * @buf: buffer to send
+ * @length: buffer length
+ * @timeout: send timeout in milliseconds, 0 for infinite timeout
+ *
+ * Return:
+ * * written size in bytes
+ * * < 0 on error
+ */
+ssize_t mei_cldev_send_timeout(struct mei_cl_device *cldev, const u8 *buf, size_t length,
+ unsigned long timeout)
+{
+ return mei_cldev_send_vtag_timeout(cldev, buf, length, 0, timeout);
+}
+EXPORT_SYMBOL_GPL(mei_cldev_send_timeout);
+
/**
* mei_cldev_recv - client receive (read)
*
diff --git a/include/linux/mei_cl_bus.h b/include/linux/mei_cl_bus.h
index c9af62e54577033bf9bae0e4..b38a56a13f39277f948c53b8 100644
--- a/include/linux/mei_cl_bus.h
+++ b/include/linux/mei_cl_bus.h
@@ -94,15 +94,23 @@ void mei_cldev_driver_unregister(struct mei_cl_driver *cldrv);

ssize_t mei_cldev_send(struct mei_cl_device *cldev, const u8 *buf,
size_t length);
+ssize_t mei_cldev_send_timeout(struct mei_cl_device *cldev, const u8 *buf,
+ size_t length, unsigned long timeout);
ssize_t mei_cldev_recv(struct mei_cl_device *cldev, u8 *buf, size_t length);
ssize_t mei_cldev_recv_nonblock(struct mei_cl_device *cldev, u8 *buf,
size_t length);
+ssize_t mei_cldev_recv_timeout(struct mei_cl_device *cldev, u8 *buf, size_t length,
+ unsigned long timeout);
ssize_t mei_cldev_send_vtag(struct mei_cl_device *cldev, const u8 *buf,
size_t length, u8 vtag);
+ssize_t mei_cldev_send_vtag_timeout(struct mei_cl_device *cldev, const u8 *buf,
+ size_t length, u8 vtag, unsigned long timeout);
ssize_t mei_cldev_recv_vtag(struct mei_cl_device *cldev, u8 *buf, size_t length,
u8 *vtag);
ssize_t mei_cldev_recv_nonblock_vtag(struct mei_cl_device *cldev, u8 *buf,
size_t length, u8 *vtag);
+ssize_t mei_cldev_recv_vtag_timeout(struct mei_cl_device *cldev, u8 *buf, size_t length,
+ u8 *vtag, unsigned long timeout);

int mei_cldev_register_rx_cb(struct mei_cl_device *cldev, mei_cldev_cb_t rx_cb);
int mei_cldev_register_notif_cb(struct mei_cl_device *cldev,
--
2.41.0

2023-10-11 11:02:45

by Winkler, Tomas

[permalink] [raw]
Subject: [char-misc-next 3/4] mei: pxp: re-enable client on errors

From: Alexander Usyskin <[email protected]>

Disable and enable mei-pxp client on errors to clean the internal state.

Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/misc/mei/pxp/mei_pxp.c | 70 +++++++++++++++++++++++-----------
1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index c6cdd6a47308ebcc72f34c38..9875d16445bb03efcfb31cd9 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -23,6 +23,24 @@

#include "mei_pxp.h"

+static inline int mei_pxp_reenable(const struct device *dev, struct mei_cl_device *cldev)
+{
+ int ret;
+
+ dev_warn(dev, "Trying to reset the channel...\n");
+ ret = mei_cldev_disable(cldev);
+ if (ret < 0)
+ dev_warn(dev, "mei_cldev_disable failed. %d\n", ret);
+ /*
+ * Explicitly ignoring disable failure,
+ * enable may fix the states and succeed
+ */
+ ret = mei_cldev_enable(cldev);
+ if (ret < 0)
+ dev_err(dev, "mei_cldev_enable failed. %d\n", ret);
+ return ret;
+}
+
/**
* mei_pxp_send_message() - Sends a PXP message to ME FW.
* @dev: device corresponding to the mei_cl_device
@@ -35,6 +53,7 @@ mei_pxp_send_message(struct device *dev, const void *message, size_t size)
{
struct mei_cl_device *cldev;
ssize_t byte;
+ int ret;

if (!dev || !message)
return -EINVAL;
@@ -44,10 +63,20 @@ mei_pxp_send_message(struct device *dev, const void *message, size_t size)
byte = mei_cldev_send(cldev, message, size);
if (byte < 0) {
dev_dbg(dev, "mei_cldev_send failed. %zd\n", byte);
- return byte;
+ switch (byte) {
+ case -ENOMEM:
+ fallthrough;
+ case -ENODEV:
+ fallthrough;
+ case -ETIME:
+ ret = mei_pxp_reenable(dev, cldev);
+ if (ret)
+ byte = ret;
+ break;
+ }
}

- return 0;
+ return byte;
}

/**
@@ -63,6 +92,7 @@ mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
struct mei_cl_device *cldev;
ssize_t byte;
bool retry = false;
+ int ret;

if (!dev || !buffer)
return -EINVAL;
@@ -73,26 +103,22 @@ mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
byte = mei_cldev_recv(cldev, buffer, size);
if (byte < 0) {
dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte);
- if (byte != -ENOMEM)
- return byte;
-
- /* Retry the read when pages are reclaimed */
- msleep(20);
- if (!retry) {
- retry = true;
- goto retry;
- } else {
- dev_warn(dev, "No memory on data receive after retry, trying to reset the channel...\n");
- byte = mei_cldev_disable(cldev);
- if (byte < 0)
- dev_warn(dev, "mei_cldev_disable failed. %zd\n", byte);
- /*
- * Explicitly ignoring disable failure,
- * enable may fix the states and succeed
- */
- byte = mei_cldev_enable(cldev);
- if (byte < 0)
- dev_err(dev, "mei_cldev_enable failed. %zd\n", byte);
+ switch (byte) {
+ case -ENOMEM:
+ /* Retry the read when pages are reclaimed */
+ msleep(20);
+ if (!retry) {
+ retry = true;
+ goto retry;
+ }
+ fallthrough;
+ case -ENODEV:
+ fallthrough;
+ case -ETIME:
+ ret = mei_pxp_reenable(dev, cldev);
+ if (ret)
+ byte = ret;
+ break;
}
}

--
2.41.0

2023-10-11 11:02:45

by Winkler, Tomas

[permalink] [raw]
Subject: [char-misc-next 4/4] mei: update mei-pxp's component interface with timeouts

From: Alan Previn <[email protected]>

In debugging platform or firmware related MEI-PXP connection
issues, having a timeout when clients (such as i915) calling
into mei-pxp's send/receive functions have proven useful as opposed to
blocking forever until the kernel triggers a watchdog panic (when
platform issues are experienced).

Update the mei-pxp component interface send and receive functions
to take in timeouts.

Signed-off-by: Alan Previn <[email protected]>
Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
---
drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 8 ++++--
drivers/misc/mei/pxp/mei_pxp.c | 33 +++++++++++++++++++-----
include/drm/i915_pxp_tee_interface.h | 6 +++--
3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
index 80bb0018986525f16d410e56..dfc2878426fc2226ccb55270 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -20,6 +20,8 @@
#include "intel_pxp_tee.h"
#include "intel_pxp_types.h"

+#define PXP_TRANSPORT_TIMEOUT_MS 5000 /* 5 sec */
+
static bool
is_fw_err_platform_config(u32 type)
{
@@ -71,13 +73,15 @@ static int intel_pxp_tee_io_message(struct intel_pxp *pxp,
goto unlock;
}

- ret = pxp_component->ops->send(pxp_component->tee_dev, msg_in, msg_in_size);
+ ret = pxp_component->ops->send(pxp_component->tee_dev, msg_in, msg_in_size,
+ PXP_TRANSPORT_TIMEOUT_MS);
if (ret) {
drm_err(&i915->drm, "Failed to send PXP TEE message\n");
goto unlock;
}

- ret = pxp_component->ops->recv(pxp_component->tee_dev, msg_out, msg_out_max_size);
+ ret = pxp_component->ops->recv(pxp_component->tee_dev, msg_out, msg_out_max_size,
+ PXP_TRANSPORT_TIMEOUT_MS);
if (ret < 0) {
drm_err(&i915->drm, "Failed to receive PXP TEE message\n");
goto unlock;
diff --git a/drivers/misc/mei/pxp/mei_pxp.c b/drivers/misc/mei/pxp/mei_pxp.c
index 9875d16445bb03efcfb31cd9..f77d78fa50549e69f0a0873b 100644
--- a/drivers/misc/mei/pxp/mei_pxp.c
+++ b/drivers/misc/mei/pxp/mei_pxp.c
@@ -46,10 +46,20 @@ static inline int mei_pxp_reenable(const struct device *dev, struct mei_cl_devic
* @dev: device corresponding to the mei_cl_device
* @message: a message buffer to send
* @size: size of the message
- * Return: 0 on Success, <0 on Failure
+ * @timeout_ms: timeout in milliseconds, zero means wait indefinitely.
+ *
+ * Returns: 0 on Success, <0 on Failure with the following defined failures.
+ * -ENODEV: Client was not connected.
+ * Caller may attempt to try again immediately.
+ * -ENOMEM: Internal memory allocation failure experienced.
+ * Caller may sleep to allow kernel reclaim before retrying.
+ * -EINTR : Calling thread received a signal. Caller may choose
+ * to abandon with the same thread id.
+ * -ETIME : Request is timed out.
+ * Caller may attempt to try again immediately.
*/
static int
-mei_pxp_send_message(struct device *dev, const void *message, size_t size)
+mei_pxp_send_message(struct device *dev, const void *message, size_t size, unsigned long timeout_ms)
{
struct mei_cl_device *cldev;
ssize_t byte;
@@ -60,7 +70,7 @@ mei_pxp_send_message(struct device *dev, const void *message, size_t size)

cldev = to_mei_cl_device(dev);

- byte = mei_cldev_send(cldev, message, size);
+ byte = mei_cldev_send_timeout(cldev, message, size, timeout_ms);
if (byte < 0) {
dev_dbg(dev, "mei_cldev_send failed. %zd\n", byte);
switch (byte) {
@@ -84,10 +94,21 @@ mei_pxp_send_message(struct device *dev, const void *message, size_t size)
* @dev: device corresponding to the mei_cl_device
* @buffer: a message buffer to contain the received message
* @size: size of the buffer
- * Return: bytes sent on Success, <0 on Failure
+ * @timeout_ms: timeout in milliseconds, zero means wait indefinitely.
+ *
+ * Returns: number of bytes send on Success, <0 on Failure with the following defined failures.
+ * -ENODEV: Client was not connected.
+ * Caller may attempt to try again from send immediately.
+ * -ENOMEM: Internal memory allocation failure experienced.
+ * Caller may sleep to allow kernel reclaim before retrying.
+ * -EINTR : Calling thread received a signal. Caller will need to repeat calling
+ * (with a different owning thread) to retrieve existing unclaimed response
+ * (and may discard it).
+ * -ETIME : Request is timed out.
+ * Caller may attempt to try again from send immediately.
*/
static int
-mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
+mei_pxp_receive_message(struct device *dev, void *buffer, size_t size, unsigned long timeout_ms)
{
struct mei_cl_device *cldev;
ssize_t byte;
@@ -100,7 +121,7 @@ mei_pxp_receive_message(struct device *dev, void *buffer, size_t size)
cldev = to_mei_cl_device(dev);

retry:
- byte = mei_cldev_recv(cldev, buffer, size);
+ byte = mei_cldev_recv_timeout(cldev, buffer, size, timeout_ms);
if (byte < 0) {
dev_dbg(dev, "mei_cldev_recv failed. %zd\n", byte);
switch (byte) {
diff --git a/include/drm/i915_pxp_tee_interface.h b/include/drm/i915_pxp_tee_interface.h
index a702b6ec17f7ca95eda7d225..7d96985f2d05327151b0dfc1 100644
--- a/include/drm/i915_pxp_tee_interface.h
+++ b/include/drm/i915_pxp_tee_interface.h
@@ -22,8 +22,10 @@ struct i915_pxp_component_ops {
*/
struct module *owner;

- int (*send)(struct device *dev, const void *message, size_t size);
- int (*recv)(struct device *dev, void *buffer, size_t size);
+ int (*send)(struct device *dev, const void *message, size_t size,
+ unsigned long timeout_ms);
+ int (*recv)(struct device *dev, void *buffer, size_t size,
+ unsigned long timeout_ms);
ssize_t (*gsc_command)(struct device *dev, u8 client_id, u32 fence_id,
struct scatterlist *sg_in, size_t total_in_len,
struct scatterlist *sg_out);
--
2.41.0

2023-11-15 20:40:54

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [char-misc-next 3/4] mei: pxp: re-enable client on errors

On Tue, Nov 14, 2023 at 06:40:26PM +0000, Winkler, Tomas wrote:
>
>
> > -----Original Message-----
> > From: Teres Alexis, Alan Previn <[email protected]>
> > Sent: Tuesday, November 14, 2023 5:32 PM
> > To: [email protected]; Winkler, Tomas <[email protected]>
> > Cc: [email protected]; Usyskin, Alexander
> > <[email protected]>; [email protected]; intel-
> > [email protected]; Lubart, Vitaly <[email protected]>
> > Subject: Re: [char-misc-next 3/4] mei: pxp: re-enable client on errors
> >
> > On Tue, 2023-11-14 at 16:00 +0200, Ville Syrj?l? wrote:
> > > On Wed, Oct 11, 2023 at 02:01:56PM +0300, Tomas Winkler wrote:
> > > > From: Alexander Usyskin <[email protected]>
> > > >
> > > > Disable and enable mei-pxp client on errors to clean the internal state.
> > >
> > > This broke i915 on my Alderlake-P laptop.
>
> This fix was already posted, just missed the merging window
> https://lkml.org/lkml/2023/10/31/636

Gave this a spin and it fixes the issue for me. Thanks.

>
> Greg can you please take this fix into v6.7-rc2 run, or I can repost it with the correct subject.
> Thanks
> Tomas
>
>
> > >
> >
> >
> > Hi Alex, i just relooked at the series that got merged, and i noticed that in patch
> > #3 of the series, you had changed mei_pxp_send_message to return bytes sent
> > instead of zero on success. IIRC, we had agreed to not effect the behavior of
> > this component interface (other than adding the timeout) - this was the
> > intention of Patch #4 that i was pushing for in order to spec the interface
> > (which continues to say zero on success). We should fix this to stay with the
> > original behavior - where mei-pxp should NOT send partial packets and will
> > only return zero in success case where success is sending of the complete
> > packets - so we don't need to get back the "bytes sent"
> > from mei_pxp_send_message. So i think this might be causing the problem.
> >
> >
> > Side note to Ville:, are you enabling PXP kernel config by default in all MESA
> > contexts? I recall that MESA folks were running some CI testing with enable
> > pxp contexts, but didn't realize this is being enabled by default in all contexts.
> > Please be aware that enabling pxp-contexts would temporarily disabled
> > runtime-pm during that contexts lifetime.
> > Also pxp contexts will be forced to be irrecoverable if it ever hangs.
> > The former is a hardware architecture requirement but doesn't do anything if
> > you're enabling display (which I beleive also blocks in ADL). The latter was a
> > requirement to comply with Vulkan.
> >
> > ...alan
> >
>

--
Ville Syrj?l?
Intel

2023-11-27 13:22:46

by Ville Syrjälä

[permalink] [raw]
Subject: Re: [Intel-gfx] [char-misc-next 3/4] mei: pxp: re-enable client on errors

On Wed, Nov 15, 2023 at 10:35:16PM +0200, Ville Syrj?l? wrote:
> On Tue, Nov 14, 2023 at 06:40:26PM +0000, Winkler, Tomas wrote:
> >
> >
> > > -----Original Message-----
> > > From: Teres Alexis, Alan Previn <[email protected]>
> > > Sent: Tuesday, November 14, 2023 5:32 PM
> > > To: [email protected]; Winkler, Tomas <[email protected]>
> > > Cc: [email protected]; Usyskin, Alexander
> > > <[email protected]>; [email protected]; intel-
> > > [email protected]; Lubart, Vitaly <[email protected]>
> > > Subject: Re: [char-misc-next 3/4] mei: pxp: re-enable client on errors
> > >
> > > On Tue, 2023-11-14 at 16:00 +0200, Ville Syrj?l? wrote:
> > > > On Wed, Oct 11, 2023 at 02:01:56PM +0300, Tomas Winkler wrote:
> > > > > From: Alexander Usyskin <[email protected]>
> > > > >
> > > > > Disable and enable mei-pxp client on errors to clean the internal state.
> > > >
> > > > This broke i915 on my Alderlake-P laptop.
> >
> > This fix was already posted, just missed the merging window
> > https://lkml.org/lkml/2023/10/31/636
>
> Gave this a spin and it fixes the issue for me. Thanks.
>
> >
> > Greg can you please take this fix into v6.7-rc2 run, or I can repost it with the correct subject.

We're at -rc3 already and this fix is still not in!

> > Thanks
> > Tomas
> >
> >
> > > >
> > >
> > >
> > > Hi Alex, i just relooked at the series that got merged, and i noticed that in patch
> > > #3 of the series, you had changed mei_pxp_send_message to return bytes sent
> > > instead of zero on success. IIRC, we had agreed to not effect the behavior of
> > > this component interface (other than adding the timeout) - this was the
> > > intention of Patch #4 that i was pushing for in order to spec the interface
> > > (which continues to say zero on success). We should fix this to stay with the
> > > original behavior - where mei-pxp should NOT send partial packets and will
> > > only return zero in success case where success is sending of the complete
> > > packets - so we don't need to get back the "bytes sent"
> > > from mei_pxp_send_message. So i think this might be causing the problem.
> > >
> > >
> > > Side note to Ville:, are you enabling PXP kernel config by default in all MESA
> > > contexts? I recall that MESA folks were running some CI testing with enable
> > > pxp contexts, but didn't realize this is being enabled by default in all contexts.
> > > Please be aware that enabling pxp-contexts would temporarily disabled
> > > runtime-pm during that contexts lifetime.
> > > Also pxp contexts will be forced to be irrecoverable if it ever hangs.
> > > The former is a hardware architecture requirement but doesn't do anything if
> > > you're enabling display (which I beleive also blocks in ADL). The latter was a
> > > requirement to comply with Vulkan.
> > >
> > > ...alan
> > >
> >
>
> --
> Ville Syrj?l?
> Intel

--
Ville Syrj?l?
Intel

2023-11-27 13:55:08

by Greg KH

[permalink] [raw]
Subject: Re: [Intel-gfx] [char-misc-next 3/4] mei: pxp: re-enable client on errors

On Mon, Nov 27, 2023 at 03:22:29PM +0200, Ville Syrj?l? wrote:
> On Wed, Nov 15, 2023 at 10:35:16PM +0200, Ville Syrj?l? wrote:
> > On Tue, Nov 14, 2023 at 06:40:26PM +0000, Winkler, Tomas wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Teres Alexis, Alan Previn <[email protected]>
> > > > Sent: Tuesday, November 14, 2023 5:32 PM
> > > > To: [email protected]; Winkler, Tomas <[email protected]>
> > > > Cc: [email protected]; Usyskin, Alexander
> > > > <[email protected]>; [email protected]; intel-
> > > > [email protected]; Lubart, Vitaly <[email protected]>
> > > > Subject: Re: [char-misc-next 3/4] mei: pxp: re-enable client on errors
> > > >
> > > > On Tue, 2023-11-14 at 16:00 +0200, Ville Syrj?l? wrote:
> > > > > On Wed, Oct 11, 2023 at 02:01:56PM +0300, Tomas Winkler wrote:
> > > > > > From: Alexander Usyskin <[email protected]>
> > > > > >
> > > > > > Disable and enable mei-pxp client on errors to clean the internal state.
> > > > >
> > > > > This broke i915 on my Alderlake-P laptop.
> > >
> > > This fix was already posted, just missed the merging window
> > > https://lkml.org/lkml/2023/10/31/636
> >
> > Gave this a spin and it fixes the issue for me. Thanks.
> >
> > >
> > > Greg can you please take this fix into v6.7-rc2 run, or I can repost it with the correct subject.
>
> We're at -rc3 already and this fix is still not in!

Ah, good point, I'll take it now, sorry, catching up on things...