2010-11-01 12:57:46

by Jan Kiszka

Subject: Re: Crash on kvm_iommu_map_pages

[ Forgot to CC LKML - maybe it's not KVM-specific.
BTW, is anyone actually using current KVM device assignment on
Intel? I'm starting to believe that only a very few lucky people can...
]

On 01.11.2010 13:51, Jan Kiszka wrote:
> Hi again,
>
> OK, I swapped those two lines in intel_iommu_attach_device [1], fixed
> another warning in the wbinvd emulation, but now I'm about to give up.
> This is freaky MMU stuff:
>
>
> general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> last sysfs file: /sys/devices/pci0000:00/0000:00:1a.0/device
> CPU 1
> Modules linked in: kvm_intel kvm bluetooth snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device edd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec ath9k_common ath9k_hw snd_hwdep snd_pcm snd_timer snd ath pcmcia cfg80211 sdhci_pci tpm_infineon sdhci yenta_socket firewire_ohci tpm_tis mmc_core soundcore e1000e sg pcmcia_rsrc tpm firewire_core iTCO_wdt video snd_page_alloc pcmcia_core i2c_i801 rfkill tpm_bios intel_agp output fujitsu_laptop iTCO_vendor_support i2c_core serio_raw pcspkr led_class joydev crc_itu_t button battery intel_gtt ac ext4 mbcache jbd2 crc16 sha256_generic aes_x86_64 aes_generic cbc dm_crypt linear sd_mod crc_t10dif dm_snapshot dm_mod fan processor ahci libahci ata_generic libata scsi_mod thermal thermal_sys hwmon
> Nov 1 13:19:11 mchn199C kernel:
> Pid: 2248, comm: qemu-system-x86 Not tainted 2.6.36+ #12 FJNB211W/CELSIUS H700
> RIP: 0010:[<ffffffff8121de8c>] [<ffffffff8121de8c>] pfn_to_dma_pte+0x73/0x190
> RSP: 0018:ffff8800bd4bdb68 EFLAGS: 00010202
> RAX: ffff1000bd4fe000 RBX: ffff1000bd4fec00 RCX: 0000000000000009
> RDX: 0000000000000180 RSI: ffff88012a940938 RDI: 0000000000000202
> RBP: ffff8800bd4bdba8 R08: ffffea00025ac2a0 R09: 0000000000000004
> R10: 0000000000000001 R11: 0000000000000000 R12: ffff880128dfee00
> R13: 0000000000000002 R14: 00000000000f0000 R15: 0000000000000009
> FS: 00007f4990d33710(0000) GS:ffff8800be680000(0000) knlGS:0000000000000000
> CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
> CR2: 0000000001408000 CR3: 00000000bd7db000 CR4: 00000000000026e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process qemu-system-x86 (pid: 2248, threadinfo ffff8800bd4bc000, task ffff88012a940000)
> Stack:
> ffff8800bd4bdb88 ffff8800ac378000 00000000000000d2 00000000000f0000
> <0> ffff8800bd4bdce8 ffff8800ac2fc000 0000000000000003 00000000000f0001
> <0> ffff8800bd4bdbb8 ffffffff8121dfbe ffff8800bd4bdbc8 ffffffff812a9573
> Call Trace:
> [<ffffffff8121dfbe>] intel_iommu_iova_to_phys+0x15/0x2a
> [<ffffffff812a9573>] iommu_iova_to_phys+0x13/0x15
> [<ffffffffa04e91d0>] kvm_iommu_map_pages+0x77/0x194 [kvm]
> [<ffffffff8111404d>] ? __vmalloc_node+0x86/0x9b
> [<ffffffffa04e30e2>] __kvm_set_memory_region+0x4e5/0x787 [kvm]
> [<ffffffff81081ee8>] ? mark_held_locks+0x50/0x72
> [<ffffffff8137c983>] ? mutex_lock_nested+0x325/0x34d
> [<ffffffffa04e33bb>] kvm_set_memory_region+0x37/0x50 [kvm]
> [<ffffffffa04e4c15>] kvm_vm_ioctl_set_memory_region+0x18/0x1a [kvm]
> [<ffffffffa04e4e44>] kvm_vm_ioctl+0x22d/0x3b1 [kvm]
> [<ffffffff811355aa>] ? fget_light+0x17b/0x31f
> [<ffffffff81143bd7>] do_vfs_ioctl+0x4c6/0x507
> [<ffffffff81135732>] ? fget_light+0x303/0x31f
> [<ffffffff811355aa>] ? fget_light+0x17b/0x31f
> [<ffffffff8137ebb9>] ? retint_swapgs+0x13/0x1b
> [<ffffffff81143c6e>] sys_ioctl+0x56/0x7c
> [<ffffffff81002df2>] system_call_fastpath+0x16/0x1b
> Code: c7 31 db 47 8d 3c ff e9 1d 01 00 00 0f 0b 4c 89 f2 44 88 f9 48 d3 ea 81 e2 ff 01 00 00 41 83 fd 01 48 8d 1c d0 0f 84 0b 01 00 00 <f6> 03 03 0f 85 d8 00 00 00 41 8b 44 24 04 85 c0 79 08 65 8b 04
> RIP [<ffffffff8121de8c>] pfn_to_dma_pte+0x73/0x190
> RSP <ffff8800bd4bdb68>
>
>
> Can anyone parse this? Is it Intel-specific or a generic issue? The
> kernel is current kvm.git + unrelated patch + [1].
>
> Jan
>
> [1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/61923
>



2010-11-01 13:20:37

by Joerg Roedel

Subject: Re: Crash on kvm_iommu_map_pages

The registers rax and rbx contain non-canonical addresses (if
interpreted as pointers). The instruction where this happens is a mov so
I guess that the #GP is because of a non-canonical address.
Can you find out the code-line where this happens and the exact
assembler instruction? (haven't managed to decode the registers used).

Joerg




--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

2010-11-01 13:25:05

by Jan Kiszka

Subject: Re: Crash on kvm_iommu_map_pages

On 01.11.2010 14:21, Roedel, Joerg wrote:
> The registers rax and rbx contain non-canonical addresses (if
> interpreted as pointers). The instruction where this happens is a mov so
> I guess that the #GP is because of a non-canonical address.
> Can you find out the code-line where this happens and the exact
> assembler instruction? (haven't managed to decode the registers used).

In pfn_to_dma_pte, line 710:

if (!dma_pte_present(pte)) {
ffffffff8121de8c: f6 03 03 testb $0x3,(%rbx)
ffffffff8121de8f: 0f 85 d8 00 00 00 jne ffffffff8121df6d <pfn_to_dma_pte+0x154>

The first instruction raises the fault.

Jan





2010-11-01 13:52:34

by Joerg Roedel

Subject: Re: Crash on kvm_iommu_map_pages

On Mon, Nov 01, 2010 at 09:25:00AM -0400, Jan Kiszka wrote:
> In pfn_to_dma_pte, line 710:
>
> if (!dma_pte_present(pte)) {
> ffffffff8121de8c: f6 03 03 testb $0x3,(%rbx)
> ffffffff8121de8f: 0f 85 d8 00 00 00 jne ffffffff8121df6d <pfn_to_dma_pte+0x154>
>
> The first instruction raises the fault.

Ok, so it seems that my understanding of the Code: field in the
crash-message was wrong :)
Anyway, the testb uses rbx as an address which has a non-canonical
value. This means that the address of 'pte' is invalid. Since rax also
contains a wrong address the 'parent' variable probably already contains
the wrong address. Does the attached patch help?

diff --git a/include/linux/dma_remapping.h b/include/linux/dma_remapping.h
index 5619f85..ca46f24 100644
--- a/include/linux/dma_remapping.h
+++ b/include/linux/dma_remapping.h
@@ -6,7 +6,7 @@
 */
 #define VTD_PAGE_SHIFT (12)
 #define VTD_PAGE_SIZE (1UL << VTD_PAGE_SHIFT)
-#define VTD_PAGE_MASK (((u64)-1) << VTD_PAGE_SHIFT)
+#define VTD_PAGE_MASK ((((u64)-1) << VTD_PAGE_SHIFT) & ((1ULL << 52) - 1))
 #define VTD_PAGE_ALIGN(addr) (((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)

 #define DMA_PTE_READ (1)


2010-11-01 14:22:30

by Jan Kiszka

Subject: Re: Crash on kvm_iommu_map_pages

On 01.11.2010 14:53, Roedel, Joerg wrote:
> Ok, so it seems that my understanding of the Code: field in the
> crash-message was wrong :)
> Anyway, the testb uses rbx as an address which has a non-canonical
> value. This means that the address of 'pte' is invalid. Since rax also
> contains a wrong address the 'parent' variable probably already contains
> the wrong address. Does the attached patch help?
>
> diff --git a/include/linux/dma_remapping.h b/include/linux/dma_remapping.h
> index 5619f85..ca46f24 100644
> --- a/include/linux/dma_remapping.h
> +++ b/include/linux/dma_remapping.h
> @@ -6,7 +6,7 @@
> */
> #define VTD_PAGE_SHIFT (12)
> #define VTD_PAGE_SIZE (1UL << VTD_PAGE_SHIFT)
> -#define VTD_PAGE_MASK (((u64)-1) << VTD_PAGE_SHIFT)
> +#define VTD_PAGE_MASK ((((u64)-1) << VTD_PAGE_SHIFT) & ((1ULL << 52) - 1))
> #define VTD_PAGE_ALIGN(addr) (((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)
>
> #define DMA_PTE_READ (1)
>

Crashes during early boot while initializing dmar. If you need the
trace, I could set up some debug console.

Jan



2010-11-01 14:35:30

by Joerg Roedel

Subject: Re: Crash on kvm_iommu_map_pages

On Mon, Nov 01, 2010 at 03:22:15PM +0100, Jan Kiszka wrote:
> Crashes during early boot while initializing dmar. If you need the
> trace, I could set up some debug console.

Hmm, no. This was only a guess. The VTD_PAGE_MASK does not mask out the
bits 52-63 of the pte. According to the VT-d spec it is allowed to set
these bits, some are marked as AVL and some have special meanings. If a
pte has one of these bits set the phys_addr calculated will be wrong and
the virt_addr calculated from it too (probably non-canonical, leading to
the GPF).

Probably masking out these bits in dma_pte_addr helps.

Joerg

2010-11-01 15:30:51

by Jan Kiszka

Subject: Re: Crash on kvm_iommu_map_pages

On 01.11.2010 15:35, Joerg Roedel wrote:
> Hmm, no. This was only a guess. The VTD_PAGE_MASK does not mask out the
> bits 52-63 of the pte. According to the VT-d spec it is allowed to set
> these bits, some are marked as AVL and some have special meanings. If a
> pte has one of these bits set the phys_addr calculated will be wrong and
> the virt_addr calculated from it too (probably non-canonical, leading to
> the GPF).
>
> Probably masking out these bits in dma_pte_addr helps.
>

Nope. But I just noticed a fatal thinko in my fix to
intel_iommu_attach_device - probably that was the key. Need to boot the
test kernel...

Jan



2010-11-01 16:37:58

by Jan Kiszka

Subject: Re: Crash on kvm_iommu_map_pages

On 01.11.2010 16:29, Jan Kiszka wrote:
> Nope. But I just noticed a fatal thinko in my fix to
> intel_iommu_attach_device - probably that was the key. Need to boot the
> test kernel...

That was indeed the reason for this GPF: I blindly swapped the
problematic lines, releasing the wrong page. Sorry, false alarm this
time, will send out the corrected intel_iommu_attach_device fix later.

Jan

