Message-ID: <1516757219.29151.7.camel@redhat.com>
Subject: Re: "irq/matrix: Spread interrupts on allocation" breaks nouveau in mainline kernel
From: Lyude Paul
Reply-To: lyude@redhat.com
To: Thomas Gleixner
Cc: hpa@zytor.com, keith.busch@intel.com, mingo@kernel.org, "linux-kernel@vger.kernel.org"
Date: Tue, 23 Jan 2018 20:26:59 -0500
In-Reply-To: <1516744873.29151.3.camel@redhat.com>
References: <1516744873.29151.3.camel@redhat.com>
Organization: Red Hat
X-Mailing-List: linux-kernel@vger.kernel.org

JFYI: I confirmed this patch is definitely broken. I'm seeing nouveau get
assigned the same MSI vector as another device on the system, which would
explain why interrupts suddenly stop working. I'll keep looking into it further
tomorrow.

On Tue, 2018-01-23 at 17:01 -0500, Lyude Paul wrote:
> Hi! Sorry to be the bearer of bad news, but this patch actually seems to break
> suspending and resuming with nouveau on my machine:
>
> [ 29.694755] PM: suspend entry (deep)
> [ 29.694773] PM: Syncing filesystems ... done.
> [ 29.696203] Freezing user space processes ... (elapsed 0.001 seconds) done.
> [ 29.697442] OOM killer disabled.
> [ 29.697448] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
> [ 29.698232] Suspending console(s) (use no_console_suspend to debug)
> [ 29.698993] serial 00:05: disabled
> [ 29.708227] sd 4:0:0:0: [sda] Synchronizing SCSI cache
> [ 29.708428] sd 4:0:0:0: [sda] Stopping disk
> [ 30.614581] ACPI: Preparing to enter system sleep state S3
> [ 30.917726] PM: Saving platform NVS memory
> [ 30.917736] Disabling non-boot CPUs ...
> [ 30.925616] smpboot: CPU 1 is now offline
> [ 30.936915] smpboot: CPU 2 is now offline
> [ 30.952824] smpboot: CPU 3 is now offline
> [ 30.964764] smpboot: CPU 4 is now offline
> [ 30.980663] smpboot: CPU 5 is now offline
> [ 30.992692] smpboot: CPU 6 is now offline
> [ 31.002572] smpboot: CPU 7 is now offline
> [ 31.003130] ACPI: Low-level resume complete
> [ 31.003180] PM: Restoring platform NVS memory
> [ 31.003578] WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
> [ 31.003578] Modules linked in: nouveau video mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm vfat fat usbhid crc32_pclmul i2c_piix4 i2c_core shpchp k10temp wmi acpi_cpufreq crc32c_intel r8169 mii xhci_pci xhci_hcd w83627hf_wdt
> [ 31.003590] CPU: 0 PID: 11523 Comm: rtcwake Not tainted 4.15.0-rc8nouveau-clockgating+ #1
> [ 31.003591] Hardware name: MSI MS-7A39/A320M GAMING PRO (MS-7A39), BIOS 1.60 09/19/2017
> [ 31.003592] RIP: 0010:smp_call_function_single+0xdc/0xe0
> [ 31.003593] RSP: 0018:ffffc900004a3c40 EFLAGS: 00010046
> [ 31.003594] RAX: 0000000000000000 RBX: ffffc900004a3cdc RCX: 0000000000000001
> [ 31.003594] RDX: ffffc900004a3c98 RSI: ffffffff8137a180 RDI: 0000000000000000
> [ 31.003595] RBP: ffffc900004a3c70 R08: 0000000000000001 R09: 0000000000010000
> [ 31.003595] R10: ffffc900004a3c98 R11: 0000000000000000 R12: 0000000000000000
> [ 31.003596] R13: 0000000001000000 R14: ffffc900004a3d0c R15: 0000000000000000
> [ 31.003597] FS: 00007f03bee93540(0000) GS:ffff88021ae00000(0000) knlGS:0000000000000000
> [ 31.003597] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 31.003598] CR2: 00007fffb6673008 CR3: 000000020ddd4000 CR4: 00000000003406f0
> [ 31.003598] Call Trace:
> [ 31.003603] ? rdmsr_safe_on_cpu+0x4b/0x70
> [ 31.003604] rdmsr_safe_on_cpu+0x4b/0x70
> [ 31.003606] get_block_address.isra.0+0x6e/0xe0
> [ 31.003607] mce_amd_feature_init+0x63/0x2c0
> [ 31.003609] mce_syscore_resume+0x1e/0x30
> [ 31.003611] syscore_resume+0x4b/0x170
> [ 31.003613] suspend_devices_and_enter+0x608/0x7e0
> [ 31.003614] pm_suspend+0x315/0x380
> [ 31.003615] state_store+0x7d/0xe0
> [ 31.003618] kernfs_fop_write+0xfa/0x180
> [ 31.003620] __vfs_write+0x23/0x130
> [ 31.003623] ? SYSC_newfstat+0x29/0x40
> [ 31.003625] ? _cond_resched+0x15/0x40
> [ 31.003626] vfs_write+0xad/0x1a0
> [ 31.003627] SyS_write+0x42/0x90
> [ 31.003629] entry_SYSCALL_64_fastpath+0x24/0x87
> [ 31.003630] RIP: 0033:0x7f03be9ae8f4
> [ 31.003631] RSP: 002b:00007ffe6bf825f8 EFLAGS: 00000246
> [ 31.003632] Code: fe ff ff 8b 55 e8 83 e2 01 74 0a f3 90 8b 55 e8 83 e2 01 75 f6 48 83 c4 28 41 5a 5d 49 8d 62 f8 c3 8b 05 58 b6 48 01 85 c0 75 86 <0f> ff eb 82 0f 1f 44 00 00 f6 46 18 01 75 15 c7 46 18 01 00 00
> [ 31.003648] ---[ end trace 19fa2f7781ed5237 ]---
> [ 31.004025] Enabling non-boot CPUs ...
> [ 31.004052] x86: Booting SMP configuration:
> [ 31.004052] smpboot: Booting Node 0 Processor 1 APIC 0x1
> [ 31.006368] cache: parent cpu1 should not be sleeping
> [ 31.006442] microcode: CPU1: patch_level=0x08001129
> [ 31.006509] CPU1 is up
> [ 31.006525] smpboot: Booting Node 0 Processor 2 APIC 0x2
> [ 31.008832] cache: parent cpu2 should not be sleeping
> [ 31.008894] microcode: CPU2: patch_level=0x08001129
> [ 31.008966] CPU2 is up
> [ 31.008975] smpboot: Booting Node 0 Processor 3 APIC 0x3
> [ 31.011264] cache: parent cpu3 should not be sleeping
> [ 31.011329] microcode: CPU3: patch_level=0x08001129
> [ 31.011404] CPU3 is up
> [ 31.011413] smpboot: Booting Node 0 Processor 4 APIC 0x8
> [ 31.013833] cache: parent cpu4 should not be sleeping
> [ 31.013903] microcode: CPU4: patch_level=0x08001129
> [ 31.014025] CPU4 is up
> [ 31.014036] smpboot: Booting Node 0 Processor 5 APIC 0x9
> [ 31.016354] cache: parent cpu5 should not be sleeping
> [ 31.016421] microcode: CPU5: patch_level=0x08001129
> [ 31.016534] CPU5 is up
> [ 31.016544] smpboot: Booting Node 0 Processor 6 APIC 0xa
> [ 31.018857] cache: parent cpu6 should not be sleeping
> [ 31.018930] microcode: CPU6: patch_level=0x08001129
> [ 31.019047] CPU6 is up
> [ 31.019057] smpboot: Booting Node 0 Processor 7 APIC 0xb
> [ 31.021376] cache: parent cpu7 should not be sleeping
> [ 31.021444] microcode: CPU7: patch_level=0x08001129
> [ 31.021579] CPU7 is up
> [ 31.022166] ACPI: Waking up from system sleep state S3
> [ 31.070791] usb usb1: root hub lost power or was reset
> [ 31.070794] usb usb2: root hub lost power or was reset
> [ 31.071628] serial 00:05: activated
> [ 31.080265] sd 4:0:0:0: [sda] Starting disk
> [ 31.126099] hpet_rtc_timer_reinit: 68 callbacks suppressed
> [ 31.126099] hpet1: lost 2 rtc interrupts
> [ 31.160913] r8169 0000:1e:00.0 enp30s0: link down
> [ 31.255563] do_IRQ: 1.35 No irq handler for vector
> [ 31.379537] ata6: SATA link down (SStatus 0 SControl 300)
> [ 31.379558] ata1: SATA link down (SStatus 0 SControl 300)
> [ 31.380306] ata2: SATA link down (SStatus 0 SControl 300)
> [ 31.435705] ata9: SATA link down (SStatus 0 SControl 300)
> [ 31.589932] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [ 31.590320] ata5.00: configured for UDMA/133
> [ 31.610043] usb 1-4: reset low-speed USB device number 2 using xhci_hcd
> [ 32.226138] usb 1-5: reset low-speed USB device number 3 using xhci_hcd
> [ 33.257867] nouveau 0000:22:00.0: DRM: EVO timeout
> [ 34.237185] r8169 0000:1e:00.0 enp30s0: link up
> [ 35.257880] nouveau 0000:22:00.0: DRM: base-0: timeout
> [ 37.258334] nouveau 0000:22:00.0: DRM: base-0: timeout
> [ 37.276084] OOM killer enabled.
> [ 37.276612] Restarting tasks ... done.
> [ 37.277722] PM: suspend exit
>
> I haven't yet actually investigated why it does this, but a bisect of master led
> me to here.
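
For anyone triaging a similar report, here is a minimal sketch of how the two relevant symptoms in the quoted log (the unhandled vector and the later nouveau timeouts) could be pulled out of a saved dmesg capture. It assumes the `do_IRQ` message reads as `<cpu>.<vector>`; the function name `scan_dmesg` and the pattern names are hypothetical, and this is only a log filter, not a diagnosis of the vector allocation itself.

```python
import re

# Assumption: the x86 message has the form
# "do_IRQ: <cpu>.<vector> No irq handler for vector".
UNHANDLED_VECTOR = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def scan_dmesg(log_text):
    """Return (unhandled, timeouts) found in a dmesg capture.

    unhandled: list of (cpu, vector) tuples from do_IRQ complaints.
    timeouts:  list of log lines that mention nouveau and "timeout".
    """
    unhandled = []
    timeouts = []
    for raw_line in log_text.splitlines():
        # Drop leading mail quote markers ("> ") and surrounding whitespace.
        line = raw_line.lstrip("> ").rstrip()
        match = UNHANDLED_VECTOR.search(line)
        if match:
            unhandled.append((int(match.group(1)), int(match.group(2))))
        elif "nouveau" in line and "timeout" in line:
            timeouts.append(line)
    return unhandled, timeouts


if __name__ == "__main__":
    sample = """\
> [ 31.255563] do_IRQ: 1.35 No irq handler for vector
> [ 33.257867] nouveau 0000:22:00.0: DRM: EVO timeout
> [ 35.257880] nouveau 0000:22:00.0: DRM: base-0: timeout
"""
    vectors, nouveau_timeouts = scan_dmesg(sample)
    print("unhandled (cpu, vector):", vectors)
    print("nouveau timeouts:", len(nouveau_timeouts))
```

An unhandled-vector line appearing shortly before the driver timeouts is consistent with an interrupt arriving on a vector that no handler owns, but it does not by itself show which device the vector was meant for.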