Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757261AbbEVL6O (ORCPT ); Fri, 22 May 2015 07:58:14 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48338 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756767AbbEVL6K (ORCPT ); Fri, 22 May 2015 07:58:10 -0400 Date: Fri, 22 May 2015 13:58:05 +0200 From: Christophe Fergeau To: Frediano Ziglio Cc: spice-devel@lists.freedesktop.org, David Airlie , dri-devel@lists.freedesktop.org, Dave Airlie , linux-kernel@vger.kernel.org Subject: Re: [Spice-devel] [PATCH] Do not loop on ERESTARTSYS using interruptible waits Message-ID: <20150522115805.GR20750@edamame.cdg.redhat.com> References: <1591625424.1112688.1432028990916.JavaMail.zimbra@redhat.com> <163000764.1114319.1432029294390.JavaMail.zimbra@redhat.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="JMCz+drDJ1SjddZX" Content-Disposition: inline In-Reply-To: <163000764.1114319.1432029294390.JavaMail.zimbra@redhat.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4677 Lines: 117 --JMCz+drDJ1SjddZX Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hey, On Tue, May 19, 2015 at 05:54:54AM -0400, Frediano Ziglio wrote: > This problem happens using KMS surfaces and QXL driver. > To easy reproduce use KDE Plasma (which use surfaces a lot) and assure > you are using KMS surfaces (QXL driver on Fedora/RedHat has a patch to > stop using them). Open some complex application like LibreOffice and > after a while your machine get stuck using 100% CPU on Xorg. > The problem occurs as creating new surfaces not interruptible wait > are used however instead of returning ERESTARTSYS back to userspace > you try to loop but wait routines always keep returning ERESTARTSYS > once the signal is marked. > On out of memory conditions TTM module try to move objects to system > memory and QXL assure surface is updated before the move. > The fix handle differently this case using no interruptible wait so > wait functions will wait instead of returning ERESTARTSYS. > Note the when the loop occurs driver will send a lot of update requests > causing more CPU usage on Qemu side too. >=20 > Signed-off-by: Frediano Ziglio > --- > qxl/qxl_cmd.c | 12 +++--------- > qxl/qxl_drv.h | 2 +- > qxl/qxl_ioctl.c | 2 +- > 3 files changed, 5 insertions(+), 11 deletions(-) >=20 > diff --git a/drivers/gpu/drm/drivers/gpu/drm/qxl/qxl_cmd.c b/qxl/qxl_cmd.c > index 9782364..bd5404e 100644 > --- a/drivers/gpu/drm/qxl/qxl_cmd.c > +++ b/drivers/gpu/drm/qxl/qxl_cmd.c > @@ -317,14 +317,11 @@ static void wait_for_io_cmd(struct qxl_device *qdev= , uint8_t val, long port) > { > int ret; > =20 > -restart: > ret =3D wait_for_io_cmd_user(qdev, val, port, false); > - if (ret =3D=3D -ERESTARTSYS) > - goto restart; I think this one is not directly related to the fix, but can be removed because wait_for_io_cmd_user(qdev, val, port, false); will call wait_event_timeout() which cannot return ERESTARTSYS? Or was this loop causing issues too? > } > =20 > int qxl_io_update_area(struct qxl_device *qdev, struct qxl_bo *surf, > - const struct qxl_rect *area) > + const struct qxl_rect *area, bool intr) > { > int surface_id; > uint32_t surface_width, surface_height; > @@ -350,7 +347,7 @@ int qxl_io_update_area(struct qxl_device *qdev, struc= t qxl_bo *surf, > mutex_lock(&qdev->update_area_mutex); > qdev->ram_header->update_area =3D *area; > qdev->ram_header->update_surface =3D surface_id; > - ret =3D wait_for_io_cmd_user(qdev, 0, QXL_IO_UPDATE_AREA_ASYNC, true); > + ret =3D wait_for_io_cmd_user(qdev, 0, QXL_IO_UPDATE_AREA_ASYNC, intr); > mutex_unlock(&qdev->update_area_mutex); > return ret; > } > @@ -588,10 +585,7 @@ int qxl_update_surface(struct qxl_device *qdev, stru= ct qxl_bo *surf) > rect.right =3D surf->surf.width; > rect.top =3D 0; > rect.bottom =3D surf->surf.height; > -retry: > - ret =3D qxl_io_update_area(qdev, surf, &rect); > - if (ret =3D=3D -ERESTARTSYS) > - goto retry; > + ret =3D qxl_io_update_area(qdev, surf, &rect, false); My understanding is that the fix is this hunk? If so, this could be made more obvious with an intermediate commit adding the 'bool intr' arg to qxl_io_update_area and only calling it with 'true' in the appropriate places. This code path is only triggered from qxl_surface_evict() which I assume is not necessarily easily interruptible, so this change makes sense to me. However it would be much better to get a review from Dave Airlie ;) Christophe --JMCz+drDJ1SjddZX Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAEBCAAGBQJVXxnNAAoJEKnYwhQprGyCBVYP/17MOp44la4MBniz0IaXjoEq ujiVSrOcDIsZH8lYwUs8hq8IskLlUWfFejzBu8u5LOYDs9IDga3cufd30KRZnJ65 g4/zLZoGloKQXTvUl/Aa7MRTJmS1YvFAek8DBHNAprRCWRqh2MCoeip5FiCpceBi nGLYJcNxRlM9KFkMZtG0WQhh4nSVqOHzFww2nrjipoKBOEZ1QhpT1DENltUflETZ 4eBmu1nvVpwlYbUotlztTl3bIvTsUrchhlXYemQe5m9kytF8aRBm3WkdO+f8oV+u dXQlJTMJNAHSAxa9AjG0OvBotbmS55Z9FhOzhIkonUfSy/JpNlyVE+D1UuNj15iI fErqeykH/7sSB4eKIrMIroK1v9CEBytZU9A5o2tBwF+kTR0aTWG83T70VlfPg2oF 2MdBvmyIp4Xg6yklPQiL1EZq/6EvbNbbaKMXCBHEntJgOc07iDbi7z4316kMe/2R Q/8AqF2prJxZ+1QS7D+YCLmRV/2ja1Wm3SFq4k/08dv4cfxM3prUS2l0nztOiOod anwCGbWcWfF49S4ggzXwQs3dZGwkYjvyH0BQPseaXEJrF+MfA/SYlpJ8fxh5fWqQ 4SiGzbUVQeurI2k63sFbyc6N5oEAlprJS8SQlD16IHyx5Wx70ann2lJ8lWzGHkp2 lJWJCg8nl5muaIoGg1tN =zxvy -----END PGP SIGNATURE----- --JMCz+drDJ1SjddZX-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/