Subject: Re: btrfs btree_ctree_super fault
To: Dave Jones, Linus Torvalds, Jens Axboe, Andy Lutomirski, Al Viro,
 Josef Bacik, David Sterba, linux-btrfs, Linux Kernel, Dave Chinner
References: <2bdc068d-afd5-7a78-f334-26970c91aaca@fb.com>
 <203e0319-bc9b-245c-e162-709267540d22@fb.com>
 <20161026233808.GC15247@clm-mbp.thefacebook.com>
 <20161026234751.e66xyzjiwifvbuha@codemonkey.org.uk>
 <20161031185514.b22zvbxvga4xcinz@codemonkey.org.uk>
 <20161031194454.GA49877@clm-mbp.thefacebook.com>
 <20161106165539.ybwm6rqvzh2k6uja@codemonkey.org.uk>
 <20161108145912.fcjvwxcpqgd7kjei@codemonkey.org.uk>
From: Chris Mason
Message-ID: <01d76d90-8d90-e09b-40a0-63488425348d@fb.com>
Date: Tue, 8 Nov 2016 10:08:04 -0500
In-Reply-To: <20161108145912.fcjvwxcpqgd7kjei@codemonkey.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/08/2016 09:59 AM, Dave Jones wrote:
> On Sun, Nov 06, 2016 at 11:55:39AM -0500, Dave Jones wrote:
> >
> >
> > On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote:
> > > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote:
> > > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones wrote:
> > > >>
> > > >> BUG: Bad page state in process kworker/u8:12 pfn:4e0e39
> > > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 index:0x100c
> > > >> flags: 0x400000000000000c(referenced|uptodate)
> > > >> page dumped because: non-NULL mapping
> > > >
> > > >Hmm. So this seems to be btrfs-specific, right?
> > > >
> > > >I searched for all your "non-NULL mapping" cases, and they all seem to
> > > >have basically the same call trace, with some work thread doing
> > > >writeback and going through btrfs_writepages().
> > > >
> > > >Sounds like it's a race with either fallocate hole-punching or
> > > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ
> > > >clearly ran other filesystems too but I am not seeing this backtrace
> > > >for anything else.
> > >
> > > Agreed, I think this is a separate bug, almost certainly btrfs specific.
> > > I'll work with Dave on a better reproducer.
> >
> > Still refining my 'capture ftrace when trinity detects taint' feature,
> > but in the meantime, here's a variant I don't think we've seen before:
>
> And another new one:
>
> kernel BUG at fs/btrfs/ctree.c:3172!
> invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 0 PID: 22702 Comm: trinity-c40 Not tainted 4.9.0-rc4-think+ #1
> task: ffff8804ffde37c0 task.stack: ffffc90002188000
> RIP: 0010:[]
> [] btrfs_set_item_key_safe+0x179/0x190 [btrfs]
> RSP: 0000:ffffc9000218b8a8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff8804fddcf348 RCX: 0000000000001000
> RDX: 0000000000000000 RSI: ffffc9000218b9ce RDI: ffffc9000218b8c7
> RBP: ffffc9000218b908 R08: 0000000000004000 R09: ffffc9000218b8c8
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffc9000218b8b6
> R13: ffffc9000218b9ce R14: 0000000000000001 R15: ffff880480684a88
> FS: 00007f7c7f998b40(0000) GS:ffff880507800000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000044f15f000 CR4: 00000000001406f0
> DR0: 00007f4ce439d000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> Stack:
> ffff880501430000 d305ffffa00a2245 006c000000000002 0500000000000010
> 6c000000000002d3 0000000000001000 000000006427eebb ffff880480684a88
> 0000000000000000 ffff8804fddcf348 0000000000002000 0000000000000000
> Call Trace:
> [] __btrfs_drop_extents+0xb00/0xe30 [btrfs]

We've been hunting this one for at least two years. It's the white
whale of btrfs bugs. Josef has a semi-reliable reproducer now, but I
think it's not the same as the pagevec based problems you reported
earlier.

-chris