Subject: Re: [Qemu-devel] [RFH] qemu-2.6 memory corruption with OVMF and linux-4.9
From: Philipp Hahn
To: Philipp Hahn, Laszlo Ersek
Cc: qemu-devel@nongnu.org, "linux-kernel@vger.kernel.org", Peter Jones, linux-fbdev@vger.kernel.org
Date: Tue, 20 Jun 2017 12:08:56 +0200

Hello,

On 18.06.2017 at 20:22, Philipp Hahn wrote:
> On 17.06.2017 at 18:51, Laszlo Ersek wrote:
>> (I also recommend using the "vbindiff" tool for such problems, it is
>> great for picking out patterns.)
>>
>>           ** ** ** ** ** ** ** **  8  9 ** ** ** 13 14 15
>>           -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
>> 00000000  01 e8 00 00 00 00 00 00 8c 5e 00 00 00 10 ff f1
>> 00000010  5b 78 8a 3e 00 00 00 00 00 00 00 00 00 00 00 00
>> 00000020  8c 77 00 00 00 12 00 02 18 f0 00 00 00 00 00 00
>> 00000030  00 1e 00 00 00 00 00 00 8c 8c 00 00 00 12 00 02
>> 00000040  07 70 00 00 00 00 00 00 00 14 00 00 00 00 00 00
>> 00000050  8c 9c 00 00 00 12 00 02 22 00 00 00 00 00 00 00
>> 00000060  00 40 00 00 00 00 00 00 8c ac 00 00 00 10 ff f1
>>
>> 00000000  01 e8 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
>> 00000010  5b 78 8a 3e 00 00 00 00 00 3c 00 00 00 07 00 00
>> 00000020  8c 77 00 00 00 12 00 02 00 3c 00 00 00 07 00 00
>> 00000030  00 1e 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
>> 00000040  07 70 00 00 00 00 00 00 00 3c 00 00 00 07 00 00
>> 00000050  8c 9c 00 00 00 12 00 02 00 3c 00 00 00 07 00 00
>> 00000060  00 40 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
>>           -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
>>           ** ** ** ** ** ** ** **  8  9 ** ** ** 13 14 15
>>
>> The columns that I marked with "**" are identical between "good" and
>> "bad". (These are columns 0-7, 10-12.)
>>
>> Column 8 is overwritten by zeros (every 16th byte).
>>
>> Column 9 is overwritten by 0x3c (every 16th byte).
>>
>> Column 13 is super interesting. The most significant nibble in that
>> column is not disturbed. And, in the least significant nibble, the least
>> significant three bits are turned on. Basically, the corruption could be
>> described, for this column (i.e., every 16th byte), as
>>
>>   bad = good | 0x7
>>
>> Column 14 is overwritten by zeros (every 16th byte).
>>
>> Column 15 is overwritten by zeros (every 16th byte).
>>
>> My take is that your host machine has faulty RAM. Please run memtest86+
>> or something similar.
>
> I will do so, but that seems very unlikely to me:
> - it never happens with BIOS, only with OVMF
> - for each test I start a new QEMU process, which should use a different
>   memory region
> - it repeatedly hits e1000 or libata.ko

Okay: memtest86+ 5.01 ran for 8 hours and did not find any errors in its
3 passes.

> After updating OVMF from 0~20160813.de74668f-2 to
> 0~20161202.7bbe0b3e-1 it has not yet happened again.

Anyway, thank you for your help.

Philipp
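
For readers who want to check the per-column pattern from the quoted analysis mechanically, here is a minimal sketch in Python. It applies the described transformation (columns 8, 14 and 15 zeroed, column 9 set to 0x3c, column 13 ORed with 0x7) to each "good" row and compares the result with the corresponding "bad" row. The names GOOD, BAD, parse and corrupt are made up for this illustration, and the byte values are copied from the two dumps quoted above; this is not a tool that was used in the thread.

```python
# Rows copied from the "good" and "bad" dumps quoted in the message above.
GOOD = """\
01 e8 00 00 00 00 00 00 8c 5e 00 00 00 10 ff f1
5b 78 8a 3e 00 00 00 00 00 00 00 00 00 00 00 00
8c 77 00 00 00 12 00 02 18 f0 00 00 00 00 00 00
00 1e 00 00 00 00 00 00 8c 8c 00 00 00 12 00 02
07 70 00 00 00 00 00 00 00 14 00 00 00 00 00 00
8c 9c 00 00 00 12 00 02 22 00 00 00 00 00 00 00
00 40 00 00 00 00 00 00 8c ac 00 00 00 10 ff f1
"""

BAD = """\
01 e8 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
5b 78 8a 3e 00 00 00 00 00 3c 00 00 00 07 00 00
8c 77 00 00 00 12 00 02 00 3c 00 00 00 07 00 00
00 1e 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
07 70 00 00 00 00 00 00 00 3c 00 00 00 07 00 00
8c 9c 00 00 00 12 00 02 00 3c 00 00 00 07 00 00
00 40 00 00 00 00 00 00 00 3c 00 00 00 17 00 00
"""


def parse(dump: str) -> list[bytes]:
    """Turn a block of space-separated hex bytes into one bytes object per row."""
    return [bytes.fromhex(line) for line in dump.strip().splitlines()]


def corrupt(row: bytes) -> bytes:
    """Apply the corruption pattern described in the analysis to one 16-byte row."""
    out = bytearray(row)
    out[8] = 0x00    # column 8: overwritten by zero
    out[9] = 0x3C    # column 9: overwritten by 0x3c
    out[13] |= 0x07  # column 13: low three bits turned on
    out[14] = 0x00   # column 14: overwritten by zero
    out[15] = 0x00   # column 15: overwritten by zero
    return bytes(out)


good_rows = parse(GOOD)
bad_rows = parse(BAD)

# One boolean per row: does the described pattern reproduce the "bad" row?
matches = [corrupt(g) == b for g, b in zip(good_rows, bad_rows)]
print(all(matches))
```

If the script prints True, the description accounts for every differing byte in the quoted excerpt; it says nothing about bytes outside those seven rows.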