Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934894AbcCORJ7 (ORCPT ); Tue, 15 Mar 2016 13:09:59 -0400 Received: from mail-bl2on0079.outbound.protection.outlook.com ([65.55.169.79]:39003 "EHLO na01-bl2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S933150AbcCORJ4 (ORCPT ); Tue, 15 Mar 2016 13:09:56 -0400 Authentication-Results: amd.com; dkim=none (message not signed) header.d=none;amd.com; dmarc=none action=none header.from=amd.com; Subject: Re: [PART1 RFC v2 05/10] KVM: x86: Detect and Initialize AVIC support To: Paolo Bonzini , , , , , References: <1457124368-2025-1-git-send-email-Suravee.Suthikulpanit@amd.com> <1457124368-2025-6-git-send-email-Suravee.Suthikulpanit@amd.com> <56DDAF2D.3060703@redhat.com> CC: , , , From: Suravee Suthikulpanit Message-ID: <56E841CC.4090806@amd.com> Date: Wed, 16 Mar 2016 00:09:32 +0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <56DDAF2D.3060703@redhat.com> Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [124.121.8.20] X-ClientProxiedBy: SIXPR04CA0089.apcprd04.prod.outlook.com (10.141.119.47) To SN1PR12MB0447.namprd12.prod.outlook.com (25.162.105.140) X-MS-Office365-Filtering-Correlation-Id: 80a004da-3be9-42ea-2686-08d34cf4a020 X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;2:0WGsXngueGZAwyVTA7sjUV+QuMSOEBD9VrLkTlv6YxLSqn8ZCnoFcLlmgWN/Hlnz3BlLR73hIG/7U2sBGZn4dTXLfSsJOZL3/tfnpCIgnnMV/P0JH/SveKEtBF3xsG8xdiNqVrdlRXR4E+dUNHKHdvpI0Rb9aXxQvr70gHcGSZPUOVlQ3q49RtsOxr/8rjBG;3:Yi0rBCIFtpskJ93WjN68M+BCW7uWCKF6Ew2OTBNdixvL4y+lgQw7KqnaAv8NI6NLPnqXTD/PrHwbshZpHakYnTtxalvNssGinX3KXP8jgnKkpwdgFxpa3kx2JK6euMtb;25:y1b+ynIXT1FnSrYFg1DqV2WBzRbuIVxoc3VVbxiM8bPieZ3/lKOdA3pveQ/RxcO/1jwVKoiR6EqRqLFkBzI3Df8WDRHzupda7dvlGN+uJEZevD1kDUhneSSj0MPFo6+ENe1ypJyBkx5+hsS7pjwkr8whsgyQVllq0WrDgTQgQjl0l1nOTXC8wwq4muDMDEPcDaaD1BEhxxC54lZPFNy7nRrIpn61kaIkBuLbPK+lhdF+T5bi78MlSOL9QrJiEqti2MqOXulkodiXsASHJ/XXh3M9e/9jzm/CIXzE+jaUcQFi2uhGzmLN7stZt5CwNjDF4oKUv6ORfuMPy7Q6mC08kQ== X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:SN1PR12MB0447; X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;20:PZN0ORW1W9ucMApAmCXdLV4oALVadCmaCGNkfZr3GjmxBZ0Mrc6F/VaG8y/ha6aGS26rlmRPb/E9FH7K0HyA6y9I93RVaSKfok/cTRVGWAG0trEE1uM7rMYY9ptCZP1AUuhLzAapY1Rj5IzFDtfi/GpXlJHoR3pEDsI1tR/pPWCV7QECXGzQxIBpW8i8q4gWgV/STlyd94RAIYjdMupSn2pZgFoT6Ix0zD2h2szf0h0CuPdrbcK9LYId/feuJrWHtOHwAPLPeDBV3Mjp7V9IZ8/wzc4r9aS6moKN7vtsyIBNc0uUJNQ6oGbrmwzRYTJh6HwpAKR860/eNJD8H195GkZMy7xgh436Kmw38whgYeejni3QrTTMhtxKs88UpVW3KQzRe3HjdamG3JXPfjvTAKI+EYkleTaBn2KqRGjLo/Z0ZJFWm8LmEZwNyD+x3r9qz/otYbFqlIIkxma35ldPj9kyFCJuyE04ZgLa91A5fpAgP6FhjC6u1A4t/4JfjNP0;4:XX+q3qk3EOANFetZRhQ/fqV+AtkON7QCibHNIpGpSHOoMFDJZThjJ7mCdTqWsxpWrX3glfkTIZWBCf+BQdFxqwckwfMkoAD8YxVqCotypyJf5P/cGfwtdkNTIlCtt284Yw5TNcZwPj0jZBO45/AsHTFJCtM4f9Vk/S0OlkrNXyQoqg1NLL0fw5vSGwEJViKKqZCHSc7knAEQreo36dee2CpSnZv28lGoo9pcPU6dC1h3y5iqbLVevgWZyMXXWuUUnt2lExwGoSHNZs14hm06yzYMn6F4n7zZTJp60RS+Sx5XNwlbpCiB8x8xUoBk/GtLMvEqP3i1aSEUmsI9Xaz/RMo/YHn7lLSBRTdkyeXLbYwmV7WKQoOPdPxE1c+H3iGl X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(8121501046)(5005006)(3002001)(10201501046);SRVR:SN1PR12MB0447;BCL:0;PCL:0;RULEID:;SRVR:SN1PR12MB0447; X-Forefront-PRVS: 08828D20BC X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10009020)(4630300001)(6049001)(6009001)(479174004)(164054003)(377454003)(24454002)(1096002)(230700001)(50466002)(3846002)(64126003)(586003)(4326007)(2906002)(6116002)(92566002)(54356999)(65816999)(76176999)(50986999)(81166005)(83506001)(80316001)(86362001)(19580395003)(33656002)(65806001)(65956001)(77096005)(36756003)(5008740100001)(66066001)(5004730100002)(47776003)(4001350100001)(189998001)(117156001)(23746002)(5001770100001)(2950100001)(42186005)(2201001);DIR:OUT;SFP:1101;SCL:1;SRVR:SN1PR12MB0447;H:[192.168.0.19];FPR:;SPF:None;MLV:sfv;LANG:en; X-Microsoft-Exchange-Diagnostics: =?Windows-1252?Q?1;SN1PR12MB0447;23:J8nTFt9Rj7Dks1DBbH4luFdFTgTAGFa/Cxx3V?= =?Windows-1252?Q?e1fML4+aVvKfTXWTd6R7EccYSrvRvjISmTUF7wW/iZQcX70laeTfzfGy?= =?Windows-1252?Q?zruhk+uQeChMEK0EGHFuuy3jdC6Gr7s7eCi0FKhEk7YhG7XDZ6PY6ZKK?= =?Windows-1252?Q?CcK98amx5ZQdeMxWNqRttgv42lJG5n6u7y3R+7eXHRu1jEtkm7QYxS9C?= =?Windows-1252?Q?89z+BuAOJAywtiJ7Qvg9PQt1DvgivvZrsG7nHa82VZXO/QSW2NWLzOkh?= =?Windows-1252?Q?4qGHr79QHVf5bzJuSNOdEEedg9ySohcWSIZknAMlEgTiu+DzZjwToNRp?= =?Windows-1252?Q?WiGhpfzNzDNsw2duq/8+QDjU2LSAAn0piPF04LUPXlj9ZPNQoxiijIiV?= =?Windows-1252?Q?Hn0G6wAqsJL/k6PxgZnN8QYXutVmPzvFX4C0jW4rTcvwj8moQTfcwH77?= =?Windows-1252?Q?xYfJOzfQKTLG+svPdvHyFH3y+r6wfPSotIo8v03TRgZ7eiANfwnQ5VBM?= =?Windows-1252?Q?ufo/Sv95IwnPlO94vCbMbouY9PiaH419qURcZC12vg35+BCcYWqw/ea3?= =?Windows-1252?Q?sydoU6Vqgiw1WIuofKrWlOBb/z0gxf49U455TTu8E0xQqGnQ6sohtsvC?= =?Windows-1252?Q?bBNaci7lqlvWJRmm4jSFCqTfVbRH6cRKpe9qDpcV7xH4zPe7Gs3roZNC?= =?Windows-1252?Q?PavvyBcf4CpNefWIRQdHAaXH47YWnTRWCKlMktN2ROOsdzF7tBMZ3qMG?= =?Windows-1252?Q?MNS2e/Apgi+mEPCl7r5eolU5RbEZt31NVycFV21EGofkGsj3XFT4x6WB?= =?Windows-1252?Q?bzmN+sRtkAmQkrAV1TCFsFbskWYn5kueLLhRXkOUloEj4mJOr1dRbdnn?= =?Windows-1252?Q?zI+LWcajnqF3N7kt0lf1leFSBgANXJmFRJg7sgOXHxXUwcJB2yPJwKFS?= =?Windows-1252?Q?nBqFtp/wRtoCDKbYrFYlW0zbBWS44nX2ZP5qVBIg/jZLcOknKTVvO804?= =?Windows-1252?Q?czAEe25Fjrt9ijwWW6pX1WofdxLYx5mko2m7943Q2NiTXQJno+PASN2o?= =?Windows-1252?Q?vUSHdRGvYuV5fHwvAtbb28Er4rcE5wZX6EUUbYvT18OTf3UyGu7xB3hg?= =?Windows-1252?Q?bI2vQzgPq2ZGAr0KsgdcQ0F1tj4mGGH3edzDkCdsFCoGcBpVwO0rLKms?= =?Windows-1252?Q?5gYr6JbTGO8PIt8It9hS6oXhh7304P5lxtmA1LTsyXEEQ2CQwTm?= X-Microsoft-Exchange-Diagnostics: 1;SN1PR12MB0447;5:D9ebwlNiFKKhcsvt9QIUaRJgUnczScDKoGtvP6Va3O+vMTFNsrT+yX5oXl9SeCHW/HAcLme/FN7aNg7vIw8XtUpKx+Y9SXiXuWuIQzkMY2XLQr3G8xB0RvJGN2H1DLg4fK8HiDNyh9cE+6BWdHEiqA==;24:b5tBP3+QuXHDrywkLgFL5E2YzAeE1px5MCGaaHtCbhUA0MNY5qRO+Iewz/Lys7HHMJ9/TROcnAx0Nl1uY+s//1RhY14nKGpWlmt/JaOKNDo=;20:mUgEyKjqNOXCL9JrTUvALasoN4e7pdxJkX4OIIaLycCXIIQRfHAGym+lec/tujoipkFHFB57LPq9YcJfEonF9z674GGQvTUFZxO4VGJkztvYZK5Fz6VXTbtYcUVtP8B/jeDOP0SyDY5uv0GUI2KTL67gj78UUe4gChS6t5q3tZ8uechVknWdG+TihpByECX+cWlTeNT9ErxXdUZdK1SE6Deqf4ib5zoHOsWgQHKQp11X92wkIqsw85ctpwyCM5NH X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Mar 2016 17:09:50.4063 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1PR12MB0447 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4149 Lines: 134 Hi On 03/07/2016 11:41 PM, Paolo Bonzini wrote: > On 04/03/2016 21:46, Suravee Suthikulpanit wrote: > > [....] >> +/* Note: This structure is per VM */ >> +struct svm_vm_data { >> + atomic_t count; >> + u32 ldr_mode; >> + u32 avic_max_vcpu_id; >> + u32 avic_tag; >> + >> + struct page *avic_log_ait_page; >> + struct page *avic_phy_ait_page; > > You can put these directly in kvm_arch. Do not use abbreviations: > > struct page *avic_logical_apic_id_table_page; > struct page *avic_physical_apic_id_table_page; > Actually, the reason I would like to introduce this per-arch specific structure is because I feel that it is easier to manage these processor-specific variable/data-structure. If we add all these directly into kvm_arch, which is shared b/w SVM and VMX, it is more difficult to tell which one is used in the different code base. >> [...] >> + memcpy(vapic_bkpg, svm->in_kernel_lapic_regs, PAGE_SIZE); >> + svm->vcpu.arch.apic->regs = vapic_bkpg; > > Can you explain the flipping logic, and why you cannot just use the > existing apic.regs? Please see "explanation 1" below. >> [...] >> +static struct svm_avic_phy_ait_entry * >> +avic_get_phy_ait_entry(struct kvm_vcpu *vcpu, int index) >> +{ >> + [.....] >> +} >> + >> +struct svm_avic_log_ait_entry * >> +avic_get_log_ait_entry(struct kvm_vcpu *vcpu, u8 mda, bool is_flat) >> +{ >> + [.....] >> +} > > Instead of these functions, create a complete function to handle APIC_ID > and APIC_LDR writes. Then use kmap/kunmap instead of page_address. > Ok. May I ask why we are against using page_address? I have see that used in several places in the code. >> [...] >> +static int avic_alloc_bk_page(struct vcpu_svm *svm, int id) >> +{ >> + int ret = 0, i; >> + bool realloc = false; >> + struct kvm_vcpu *vcpu; >> + struct kvm *kvm = svm->vcpu.kvm; >> + struct svm_vm_data *vm_data = kvm->arch.arch_data; >> + >> + mutex_lock(&kvm->slots_lock); >> + >> + /* Check if we have already allocated vAPIC backing >> + * page for this vCPU. If not, we need to realloc >> + * a new one and re-assign all other vCPU. >> + */ >> + if (kvm->arch.apic_access_page_done && >> + (id > vm_data->avic_max_vcpu_id)) { >> + kvm_for_each_vcpu(i, vcpu, kvm) >> + avic_unalloc_bk_page(vcpu); >> + >> + __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, >> + 0, 0); >> + realloc = true; >> + vm_data->avic_max_vcpu_id = 0; >> + } >> + >> + /* >> + * We are allocating vAPIC backing page >> + * upto the max vCPU ID >> + */ >> + if (id >= vm_data->avic_max_vcpu_id) { >> + ret = __x86_set_memory_region(kvm, >> + APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, >> + APIC_DEFAULT_PHYS_BASE, >> + PAGE_SIZE * (id + 1)); > > Why is this necessary? The APIC access page is a peculiarity of Intel > processors (and the special memslot for only needs to map 0xfee00000 to > 0xfee00fff; after that there is the MSI area). > Please see "explanation 1" below. >> [...] >> + if (ret) >> + goto out; >> + >> + vm_data->avic_max_vcpu_id = id; >> + } >> + >> + /* Reinit vAPIC backing page for exisinting vcpus */ >> + if (realloc) >> + kvm_for_each_vcpu(i, vcpu, kvm) >> + avic_init_bk_page(vcpu); > > Why is this necessary? Explanation 1: The current lapic regs page is allocated using get_zeroed_page(), which can be paged out. If I use these pages for AVIC backing pages, it seems to cause VM to slow down quite a bit due to a lot of page faults. Currently, the AVIC backing pages are acquired from __x86_set_memory region() with APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, which maps the pages for address 0xfee00000 and above for VM to use. I mostly grab this from the VMX implementation in alloc_apic_access_page(). However, the memslot requires specification of the size at the time when calling __x86_set_memory_region(). However, I can't seem to figure out where I can get the number of vcpus at the time when we creating VM. Therefore, I have to track the vcpu creation, and re-acquire larger memslot every time vcpu_create() is called. I was not sure if this is the right approach, any suggestion for this part. Thanks, Suravee