From: Tvrtko Ursulin
To: linux-kernel@vger.kernel.org, tvrtko.ursulin@linux.intel.com
Cc: Tvrtko Ursulin, Peter Zijlstra, Umesh Nerlige Ramappa, Aravind Iddamsetty
Subject: [RFC 0/3] Fixing i915 PMU use after free after driver unbind
Date: Mon, 15 Jan 2024 17:01:17 +0000
Message-Id: <20240115170120.662220-1-tvrtko.ursulin@linux.intel.com>

Hi Peter, all,

This is an early RFC to outline a newly discovered problem in the current
handling of driver unbind with active perf fds. The sequence is basically
this:

 1. Open a perf fd
 2. Unbind a driver
 3. Close the dangling fd

Or a slightly more evil variant:

 1. Open a perf fd
 2. Unbind a driver
 3. Bind the driver again
 4. Close the dangling fd

I thought we had this covered by recording the unbound status (pmu->closed
in i915_pmu.c) and making sure that struct i915_pmu (and struct perf_pmu)
remain alive until the last event is closed (via internal reference
counting). But what I missed until now are two things:

 1) core.c: _free_event() will dereference event->pmu _after_
    event->destroy(). KASAN catches this easily, and patches 1 & 2 are the
    attempt to fix that.

 2) A more evil case where the pmu->cpu_pmu_context per-cpu allocation gets
    re-used _before_ the old perf fd is closed. There things can nicely
    explode in event_sched_out(), on list_del_init(&event->active_list)
    (with list debugging turned on, of course).

    Most easily reproducible by simply re-binding i915, which happens to
    grab the same per-cpu block, and then the new perf_pmu_register zaps
    the list_head which the old event will try to unlink itself from.

    This is what the third patch attempts to deal with. It is a bit
    incomplete though, as I was unsure what the best fix is, so I thought
    I would send it out early for some guidance.

Cc: Peter Zijlstra
Cc: Umesh Nerlige Ramappa
Cc: Aravind Iddamsetty

Tvrtko Ursulin (3):
  perf: Add new late event free callback
  drm/i915/pmu: Move i915 reference drop to new event->free()
  perf: Reference count struct perf_cpu_pmu_context to fix driver unbind

 drivers/gpu/drm/i915/i915_pmu.c |  4 ++--
 include/linux/perf_event.h      |  2 ++
 kernel/events/core.c            | 34 ++++++++++++++++++++++++---------
 3 files changed, 29 insertions(+), 11 deletions(-)

-- 
2.40.1
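
To make problem 1) easier to picture, here is a minimal userspace sketch of the ordering issue. It is not kernel code: every model_* name is hypothetical, the "freed" flag stands in for an actual free so the late access can be counted rather than crash, and the exact place where the core calls the new callback is an assumption based only on the patch titles above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's struct pmu / struct perf_event. */
struct model_pmu {
	int refcount;       /* driver reference plus one per open event */
	bool freed;         /* set when the last reference is dropped */
	int late_accesses;  /* times the "core" touched the pmu after that */
};

struct model_event {
	struct model_pmu *pmu;
	void (*destroy)(struct model_event *ev); /* early callback */
	void (*free)(struct model_event *ev);    /* late callback (assumed placement) */
};

void model_pmu_put(struct model_pmu *pmu)
{
	assert(pmu->refcount > 0);
	if (--pmu->refcount == 0)
		pmu->freed = true;
}

/* Driver-side callback that drops the event's reference on the pmu. */
void model_drop_ref(struct model_event *ev)
{
	model_pmu_put(ev->pmu);
}

/* Stand-in for the core dereferencing event->pmu after event->destroy(). */
void model_core_touch_pmu(struct model_event *ev)
{
	if (ev->pmu->freed)
		ev->pmu->late_accesses++;
}

/* Teardown order: destroy(), core still uses the pmu, then the late free(). */
void model_free_event(struct model_event *ev)
{
	if (ev->destroy)
		ev->destroy(ev);
	model_core_touch_pmu(ev);
	if (ev->free)
		ev->free(ev);
}

/* Runs open fd -> unbind -> close fd; returns the number of late accesses. */
int model_run(bool use_late_free)
{
	struct model_pmu pmu = { .refcount = 1 }; /* the driver's reference */
	struct model_event ev = { .pmu = &pmu };

	pmu.refcount++;                 /* 1. open a perf fd */
	if (use_late_free)
		ev.free = model_drop_ref;
	else
		ev.destroy = model_drop_ref;
	model_pmu_put(&pmu);            /* 2. unbind the driver */
	model_free_event(&ev);          /* 3. close the dangling fd */
	return pmu.late_accesses;
}
```

Calling model_run(false) drops the last reference from destroy() and records one access to an already released pmu; model_run(true) moves the drop to the late callback and records none. Problem 2) is not modelled here.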