From: Maximilian Heyne
CC: Amit Shah, Maximilian Heyne, Boris Ostrovsky, Juergen Gross,
    Stefano Stabellini, Wei Liu, Thomas Gleixner, Jan Beulich,
    Malcolm Crossley, David Vrabel, Konrad Rzeszutek Wilk
Subject: [PATCH v2] xen/events: Fix race in set_evtchn_to_irq
Date: Thu, 12 Aug 2021 13:09:27 +0000
Message-ID: <20210812130930.127134-1-mheyne@amazon.de>
X-Mailer: git-send-email 2.32.0
X-Mailing-List: linux-kernel@vger.kernel.org

There is a TOCTOU issue in set_evtchn_to_irq. Rows in the evtchn_to_irq
mapping are lazily allocated in this function, but the check whether a
row is already present and the initialization of that row are not
synchronized. Two threads can therefore allocate a new row for the same
evtchn_to_irq entry at the same time, and each adds its irq mapping to
its own newly allocated row. One thread then overwrites what the other
has set for evtchn_to_irq[row], so that thread's irq mapping is lost.
This triggers a BUG_ON later in bind_evtchn_to_cpu:

  INFO: pci 0000:1a:15.4: [1d0f:8061] type 00 class 0x010802
  INFO: nvme 0000:1a:12.1: enabling device (0000 -> 0002)
  INFO: nvme nvme77: 1/0/0 default/read/poll queues
  CRIT: kernel BUG at drivers/xen/events/events_base.c:427!
  WARN: invalid opcode: 0000 [#1] SMP NOPTI
  WARN: Workqueue: nvme-reset-wq nvme_reset_work [nvme]
  WARN: RIP: e030:bind_evtchn_to_cpu+0xc2/0xd0
  WARN: Call Trace:
  WARN: set_affinity_irq+0x121/0x150
  WARN: irq_do_set_affinity+0x37/0xe0
  WARN: irq_setup_affinity+0xf6/0x170
  WARN: irq_startup+0x64/0xe0
  WARN: __setup_irq+0x69e/0x740
  WARN: ? request_threaded_irq+0xad/0x160
  WARN: request_threaded_irq+0xf5/0x160
  WARN: ? nvme_timeout+0x2f0/0x2f0 [nvme]
  WARN: pci_request_irq+0xa9/0xf0
  WARN: ? pci_alloc_irq_vectors_affinity+0xbb/0x130
  WARN: queue_request_irq+0x4c/0x70 [nvme]
  WARN: nvme_reset_work+0x82d/0x1550 [nvme]
  WARN: ? check_preempt_wakeup+0x14f/0x230
  WARN: ? check_preempt_curr+0x29/0x80
  WARN: ? nvme_irq_check+0x30/0x30 [nvme]
  WARN: process_one_work+0x18e/0x3c0
  WARN: worker_thread+0x30/0x3a0
  WARN: ? process_one_work+0x3c0/0x3c0
  WARN: kthread+0x113/0x130
  WARN: ? kthread_park+0x90/0x90
  WARN: ret_from_fork+0x3a/0x50

This patch sets evtchn_to_irq rows via a cmpxchg operation so that each
row is set only once. The row is now cleared before it is written to
evtchn_to_irq, so that no race arises once the row is visible to other
threads.

While at it, do not require the page to be zeroed, because
clear_evtchn_to_irq_row overwrites it with -1's anyway.
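
For illustration only, the publish-once pattern described above can be
sketched in userspace C11. This is not the kernel code and rests on
several assumptions: table, ROWS, COLS_PER_ROW, clear_row, set_mapping
and get_mapping are hypothetical names, malloc() stands in for the page
allocator, and atomic_compare_exchange_strong() stands in for cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define ROWS		4	/* hypothetical table size */
#define COLS_PER_ROW	8	/* hypothetical row width */

/* Row pointers start out NULL and are published at most once. */
static _Atomic(atomic_int *) table[ROWS];

/* Mark every slot of a not-yet-published row as unused (-1). */
static void clear_row(atomic_int *row)
{
	for (size_t col = 0; col < COLS_PER_ROW; col++)
		atomic_init(&row[col], -1);
}

/* Store value at (row, col), allocating the row on first use. */
static int set_mapping(size_t row, size_t col, int value)
{
	if (row >= ROWS || col >= COLS_PER_ROW)
		return -1;

	if (atomic_load(&table[row]) == NULL) {
		atomic_int *fresh = malloc(COLS_PER_ROW * sizeof(*fresh));
		atomic_int *expected = NULL;

		if (fresh == NULL)
			return -1;

		/* Initialise the row while it is still private. */
		clear_row(fresh);

		/*
		 * Publish only if the slot is still NULL. If another
		 * thread published first, drop our copy.
		 */
		if (!atomic_compare_exchange_strong(&table[row], &expected,
						    fresh))
			free(fresh);
	}

	/* Whichever row won the exchange is the one written to. */
	atomic_store(&atomic_load(&table[row])[col], value);
	return 0;
}

/* Return the stored value, or -1 if the slot or its row is unset. */
static int get_mapping(size_t row, size_t col)
{
	atomic_int *r;

	if (row >= ROWS || col >= COLS_PER_ROW)
		return -1;

	r = atomic_load(&table[row]);
	return r ? atomic_load(&r[col]) : -1;
}
```

A thread that loses the exchange frees its own row and then writes into
the row published by the winner, so no mapping ends up in a row that is
about to be discarded.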

Signed-off-by: Maximilian Heyne
Fixes: d0b075ffeede ("xen/events: Refactor evtchn_to_irq array to be dynamically allocated")
---
 drivers/xen/events/events_base.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index d7e361fb0548..0e44098f3977 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -198,12 +198,12 @@ static void disable_dynirq(struct irq_data *data);
 
 static DEFINE_PER_CPU(unsigned int, irq_epoch);
 
-static void clear_evtchn_to_irq_row(unsigned row)
+static void clear_evtchn_to_irq_row(int *evtchn_row)
 {
 	unsigned col;
 
 	for (col = 0; col < EVTCHN_PER_ROW; col++)
-		WRITE_ONCE(evtchn_to_irq[row][col], -1);
+		WRITE_ONCE(evtchn_row[col], -1);
 }
 
 static void clear_evtchn_to_irq_all(void)
@@ -213,7 +213,7 @@ static void clear_evtchn_to_irq_all(void)
 	for (row = 0; row < EVTCHN_ROW(xen_evtchn_max_channels()); row++) {
 		if (evtchn_to_irq[row] == NULL)
 			continue;
-		clear_evtchn_to_irq_row(row);
+		clear_evtchn_to_irq_row(evtchn_to_irq[row]);
 	}
 }
 
@@ -221,6 +221,7 @@ static int set_evtchn_to_irq(evtchn_port_t evtchn, unsigned int irq)
 {
 	unsigned row;
 	unsigned col;
+	int *evtchn_row;
 
 	if (evtchn >= xen_evtchn_max_channels())
 		return -EINVAL;
@@ -233,11 +234,18 @@ static int set_evtchn_to_irq(evtchn_port_t evtchn, unsigned int irq)
 		if (irq == -1)
 			return 0;
 
-		evtchn_to_irq[row] = (int *)get_zeroed_page(GFP_KERNEL);
-		if (evtchn_to_irq[row] == NULL)
+		evtchn_row = (int *) __get_free_pages(GFP_KERNEL, 0);
+		if (evtchn_row == NULL)
 			return -ENOMEM;
 
-		clear_evtchn_to_irq_row(row);
+		clear_evtchn_to_irq_row(evtchn_row);
+
+		/*
+		 * We've prepared an empty row for the mapping. If a different
+		 * thread was faster inserting it, we can drop ours.
+		 */
+		if (cmpxchg(&evtchn_to_irq[row], NULL, evtchn_row) != NULL)
+			free_page((unsigned long) evtchn_row);
 	}
 
 	WRITE_ONCE(evtchn_to_irq[row][col], irq);
-- 
2.32.0

Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Managing directors: Christian Schlaeger, Jonathan Weiss
Registered at the Charlottenburg District Court under HRB 149173 B
Registered office: Berlin
VAT ID: DE 289 237 879