Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp5530353imm; Tue, 12 Jun 2018 09:10:27 -0700 (PDT) X-Google-Smtp-Source: ADUXVKINJWXTyaHTImiT8P/7wF1YSTrwxQWqmNGWsskpo/8hmezMpLWn+/uDLy3/aQtSYYgFSISt X-Received: by 2002:a63:4701:: with SMTP id u1-v6mr899359pga.34.1528819827139; Tue, 12 Jun 2018 09:10:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528819827; cv=none; d=google.com; s=arc-20160816; b=s5QvpnxokZPUd+lxL4lGYPXwy8j9FKgnKV0C9ZvBKvFaAExCb+AKXKsEcofRPtr4E9 mcohuhVBC4awNTDTrkg2tMTUakMsJ0L2795BIiDfjIB9ZwuC6xZChjHvG3dP2Jy587ab gSsicPAtof5eEp9G1lMiY0lqwuCbDgqCEzp/4P8aTTnveRO7zq/FySLCsDXbumx9rjLj n2Au0E9Kkm/P+iGUl3o6diI/e9LImRptDg49Ey+AvK/hZBAz93WWpDWwFmFnjFH6FE0E AURwvuapiUEVWipOj9VmXvhnahjUX9EtJ+ZM9WlBDHmDE96ubbAbgUErieh0QbkAe1DT nbOg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:to:subject:dkim-signature :arc-authentication-results; bh=I9Gg5uqro78UICNj3lDrTdUyHsvpOAf0mmwZZNookw8=; b=WBbhOXdfyVyubhKW4yw6r/Y0+5FIm9HPwtAx2Fyq7ERcHofsr3801vNsIZb6df9BJw AzzRcFWfvPnhqPKamHdib0MPb/wK3BbZpm9gRh7csiARVEg2s0WjLWehuPUTPvxE+WYt Bi1NSm550xgyL5j81dwFoWoJhbsr2yEtIdb3Fh2ZOAzxNoNxIxFS5Rx77Cksw/l02sb4 Nywdjx//6aRcVbiFuFU25V19Yjeq5H+md1sdTKi4WaN7EuxqHJNuq37a966MMV6rYi1V BoRSdTbgjXtvYyaspYWG5dzfqfJ4ltPKg9AClUJEJK4YnFk1P+7Q0ShROPhrdjqwMsts K6Bw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=NxDUr3C0; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o2-v6si390350pfh.327.2018.06.12.09.10.12; Tue, 12 Jun 2018 09:10:27 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=NxDUr3C0; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934191AbeFLQJr (ORCPT + 99 others); Tue, 12 Jun 2018 12:09:47 -0400 Received: from mail-io0-f196.google.com ([209.85.223.196]:43729 "EHLO mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933137AbeFLQJq (ORCPT ); Tue, 12 Jun 2018 12:09:46 -0400 Received: by mail-io0-f196.google.com with SMTP id t6-v6so28697158iob.10 for ; Tue, 12 Jun 2018 09:09:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=I9Gg5uqro78UICNj3lDrTdUyHsvpOAf0mmwZZNookw8=; b=NxDUr3C0CMHUVah1klrpJWrs3Gvs8Xw1KrLHBuHOl+tnGaickLCEf0nqsGRfx/hHXw Ww6XbCBb9ZE9hasQKPw/7gKtbgjESUuHBgG7BMY5ubIrKoS6E9rlAarDVpHQHp42oF86 EUOAOdEkbOfERBuD0qez4lG99J5amYS5YlmkiMD6YkXl6vOBafQTNpJQpzc2w65zcyDL DHJ1vhJw+p29BM4eZqsN7bvYdhFr6/VhnApMetTYN5KTJP17bQRUCl0hIs8LGYYj6YJx CwGaXxT/fHQNTDbW/haUAgJToucgBzZgHVohWrhQdm9K7fpyE8goDOZZEAfJRcrepk/j RhFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=I9Gg5uqro78UICNj3lDrTdUyHsvpOAf0mmwZZNookw8=; b=s3NMZkRJwUKW7sd4zXVSpb85/qwJIzIzPVDZAw7KcOaBrrXK/4Cd6RWOtf3WB+qemm fNslOX+0S9baEDcYewmMGTDUK1ZRMdoPfknTnftXbxL4cNVZydkYNh2KFY7n5JAKUMod OjvbkSN41Ne7jxDE1HjWT2EJbeiTekDBi7lCdaRU2b4QNKgBfjPAP2go84Zj37thv7OT kL/H/b6j4ysKOLbEol/w4IU8hHf1aBpBzofuJwgNgPQF3z0NrcsZJGa3bi0yvtLpT/mv 6NeZbaEl3hp5PmOtcqpFzyFeRVmq1hdgHkHuFTLmU1LC5o0lc4aDmPNMs2S6AXkPcPI2 NDww== X-Gm-Message-State: APt69E23y2BmiLWZxGpuKQOTzlycNDXhRuGANssmhxY51r28jsZZ6J7M 746yzD/Ap2nDDl5hncnDR6UgPn2rzkM= X-Received: by 2002:a6b:4917:: with SMTP id u23-v6mr1086224iob.3.1528819785349; Tue, 12 Jun 2018 09:09:45 -0700 (PDT) Received: from [192.168.1.212] (107.191.0.158.static.utbb.net. [107.191.0.158]) by smtp.gmail.com with ESMTPSA id f193-v6sm501815itf.9.2018.06.12.09.09.43 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 12 Jun 2018 09:09:44 -0700 (PDT) Subject: Re: Hard lockup in blk_mq_free_request() / wbt_done() / wake_up_all() To: Chris Boot , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org References: <9788e0e6-a448-bf85-1f41-88f42dc0071d@boo.tc> From: Jens Axboe Message-ID: Date: Tue, 12 Jun 2018 10:09:42 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0 MIME-Version: 1.0 In-Reply-To: <9788e0e6-a448-bf85-1f41-88f42dc0071d@boo.tc> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/12/18 9:38 AM, Chris Boot wrote: > Hi folks, > > I maintain a large (to me) system with 112 threads (4x Intel E7-4830 v4) > which has a MegaRAID SAS 9361-24i controller. This system is currently > running Debian's 4.16.12 kernel (from stretch-backports) with blk_mq > enabled. > > I've run into a lockup which appears to involve blq_mq and writeback > throttling. It's hard to tell if I've run into this same thing with > older kernels; I'm trying to track down a deadlock but so far I've been > fairly certain that involved the OOM killer, but this doesn't seem to. > > I've disabled blk_mq and I'm trying to reproduce the other lockup but in > the mean time I'm mailing about this. Is this something new that I've > stumbled across? I've had a look in the Git history from > 4.16..8efcf34a2639 but nothing obvious pops out at me. > > Console output below: > > [ 6528.523172] NMI watchdog: Watchdog detected hard LOCKUP on cpu 39 > [ 6528.523174] Modules linked in: netconsole configfs nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache bonding intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iTCO_wdt iTCO_vendor_support xfs intel_cstate ast ttm intel_uncore libcrc32c drm_kms_helper intel_rapl_perf drm sg i2c_algo_bit pcspkr tpm_tis mei_me tpm_tis_core tpm evdev joydev mei ioatdma shpchp lpc_ich wmi rng_core acp > i_pad button nfsd auth_rpcgss nfs_acl lockd grace sunrpc ipmi_si ipmi_poweroff ipmi_devintf ipmi_msghandler ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod sd_mod hid_generic usbhid hid crc32c_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper nvme nvme_core ahci libahci ehci_pci ehci_hcd > [ 6528.523230] libata megaraid_sas usbcore scsi_mod i2c_i801 ixgbe usb_common dca mdio > [ 6528.523239] CPU: 39 PID: 0 Comm: swapper/39 Not tainted 4.16.0-0.bpo.2-amd64 #1 Debian 4.16.12-1~bpo9+1 > [ 6528.523240] Hardware name: Supermicro SYS-8048B-TR4FT/X10QBi, BIOS 3.0a 05/30/2017 > [ 6528.523250] RIP: 0010:native_queued_spin_lock_slowpath+0x115/0x190 > [ 6528.523251] RSP: 0000:ffff9001ff8c3d18 EFLAGS: 00000046 > [ 6528.523253] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffff9001ff8e2480 > [ 6528.523254] RDX: 000000000000005f RSI: 0000000001800000 RDI: ffff8fe1fb7af9e8 > [ 6528.523254] RBP: 00000000ffc2b312 R08: 0000000000a00000 R09: 0000000000000000 > [ 6528.523255] R10: 00000000000002fb R11: 0000000000000027 R12: 0000000000000002 > [ 6528.523256] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 > [ 6528.523258] FS: 0000000000000000(0000) GS:ffff9001ff8c0000(0000) knlGS:0000000000000000 > [ 6528.523259] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 6528.523259] CR2: 0000555fc96a7708 CR3: 00000031ad40a003 CR4: 00000000003606e0 > [ 6528.523261] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 6528.523261] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 6528.523262] Call Trace: > [ 6528.523266] > [ 6528.523270] _raw_spin_lock_irqsave+0x32/0x40 > [ 6528.523275] __wake_up_common_lock+0x63/0xc0 > [ 6528.523283] wbt_done+0x6a/0x90 > [ 6528.523286] blk_mq_free_request+0xbc/0x190 > [ 6528.523302] scsi_end_request+0x95/0x1e0 [scsi_mod] > [ 6528.523308] scsi_io_completion+0x409/0x680 [scsi_mod] > [ 6528.523310] __blk_mq_complete_request+0xba/0x170 > [ 6528.523312] blk_mq_complete_request+0x50/0xa0 > [ 6528.523317] complete_cmd_fusion+0x23a/0x4a0 [megaraid_sas] > [ 6528.523321] megasas_isr_fusion+0x36/0x180 [megaraid_sas] > [ 6528.523323] __handle_irq_event_percpu+0x81/0x190 > [ 6528.523325] handle_irq_event_percpu+0x30/0x80 > [ 6528.523327] handle_irq_event+0x3c/0x60 > [ 6528.523329] handle_edge_irq+0x94/0x1f0 > [ 6528.523335] handle_irq+0x1f/0x30 > [ 6528.523339] do_IRQ+0x41/0xc0 > [ 6528.523341] common_interrupt+0xf/0xf > [ 6528.523342] > [ 6528.523346] RIP: 0010:cpuidle_enter_state+0xa7/0x2b0 > [ 6528.523346] RSP: 0000:ffff9b6c0c663e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde > [ 6528.523348] RAX: ffff9001ff8e18c0 RBX: 0000000000000002 RCX: 000000000000001f > [ 6528.523349] RDX: 000005edfeda7b89 RSI: 0000000040000219 RDI: 0000000000000000 > [ 6528.523350] RBP: ffff9001ff8eab20 R08: 000000000000004a R09: 0000000000000018 > [ 6528.523350] R10: 00000000000002da R11: 000000000000004a R12: ffffffff910b2c78 > [ 6528.523351] R13: 0000000000000002 R14: 0000000000000002 R15: 000005edfeda3109 > [ 6528.523355] do_idle+0x193/0x200 > [ 6528.523357] cpu_startup_entry+0x6f/0x80 > [ 6528.523362] start_secondary+0x1a4/0x1f0 > [ 6528.523366] secondary_startup_64+0xa5/0xb0 > [ 6528.523367] Code: 12 83 e0 03 83 ea 01 48 c1 e0 04 48 63 d2 48 05 80 24 02 00 48 03 04 d5 80 e4 ea 90 48 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 0d 09 eb 02 f3 90 8b Hmm that's really weird, I don't see how we could be spinning on the waitqueue lock like that. I haven't seen any wbt bug reports like this before. Are things generally stable if you just turn off wbt? You can do that for sda, for instance, by doing: # echo 0 > /sys/block/sda/queue/wbt_lat_usec It'd be interesting to get this data point. Eg leave blk-mq enabled, and then just disable wbt. Is anything disabling wbt in the system otherwise? -- Jens Axboe