Date: Thu, 5 Apr 2018 14:29:57 +0200 (CEST)
From: Thomas Gleixner
To: Dexuan Cui
cc: 'Greg KH', "Michael Kelley (EOSG)", "linux-kernel@vger.kernel.org", KY Srinivasan, Stephen Hemminger, Vitaly Kuznetsov
Subject: RE: Any standard kernel API to dynamically allocate/free per-cpu vectors on x86?
Dexuan,

On Wed, 4 Apr 2018, Dexuan Cui wrote:
> > From: Thomas Gleixner
> > That needs a very simple and minimal virtual interrupt controller driver
> > which is mostly a dummy implementation except for the activation function
> > which would allow you to retrieve the vector number and store it in the
> > MSR.
>
> Can you please give a little more guidance? e.g. is there any similar driver,
> any pointer to the required APIs, etc.
> I guess I need to dig into stuff like
> struct irq_domain_ops x86_vector_domain_ops and request_percpu_irq().

request_percpu_irq() is not applicable here. That's a different mechanism
which is used on ARM and others for PerProcessorInterrupts, which have a
single virtual irq number that maps to a single hardware vector number
identical on all CPUs.

We could make it work for x86, but then we are back to the point where we
need the same vector on all CPUs, with all the pain that involves.

> Your quick pointer would help a lot!
>
> > There are a few details to be hashed out vs. CPU hotplug, but we have all
> > the infrastructure in place to deal with that.
> Sounds great!
>
> BTW, so far, Hyper-V doesn't support CPU hotplug, but it supports dynamic
> CPU online/offline. I guess I must also consider CPU online/offline here.

Yes, that's all covered. The trick is to use the affinity managed interrupt
facility for these per cpu interrupts; then the CPU online/offline case,
including physical (virtual) hotplug, is dealt with automagically. You
request the irqs once with request_irq() and they stay requested for the
lifetime. No action is required on the driver side for CPU online/offline
events vs. the interrupt.

Find below a hastily cobbled together minimal starting point. You need to
fill in the gaps by looking at similar implementations: ioapic for some
stuff and the way simpler UV code in x86/platform/uv/uv_irq.c for most of
it.

Hope that helps. If you have questions or run into limitations, feel free
to ask.

Thanks,

        tglx

static struct irq_domain *hyperv_synic_domain;

static struct irq_chip hyperv_synic = {
        .name                   = "HYPERV-SYNIC",
        .irq_mask               = hv_noop,
        .irq_unmask             = hv_noop,
        .irq_eoi                = hv_ack_apic,
        .irq_set_affinity       = irq_chip_set_affinity_parent,
};

static int hyperv_irqdomain_activate(struct irq_domain *d,
                                     struct irq_data *irqd, bool reserve)
{
        struct irq_cfg *cfg = irqd_cfg(irqd);

        /*
         * cfg gives you access to the destination apicid and the vector
         * number. If you need the CPU number as well, then you can either
         * retrieve it from the effective affinity cpumask which you can
         * access with irq_data_get_effective_affinity_mask(irqd) or we
         * can extend irq_cfg to hold the target CPU number (would be a
         * trivial thing to do). So this is the place where you can store
         * the vector number.
         */
        return 0;
}

/*
 * activate is called when:
 *  - the interrupt is requested via request_irq()
 *  - the interrupt is restarted on a cpu online event
 *    (CPUHP_AP_IRQ_AFFINITY_ONLINE)
 *
 * deactivate is called when:
 *  - the interrupt is freed via free_irq()
 *  - the interrupt is shut down on a cpu offline event shortly
 *    before the outgoing CPU dies (irq_migrate_all_off_this_cpu()).
 */
static const struct irq_domain_ops hyperv_irqdomain_ops = {
        .alloc          = hyperv_irqdomain_alloc,
        .free           = hyperv_irqdomain_free,
        .activate       = hyperv_irqdomain_activate,
        .deactivate     = hyperv_irqdomain_deactivate,
};

static int hyperv_synic_setup(void)
{
        struct fwnode_handle *fwnode;
        struct irq_domain *d;
        void *host_data = NULL;

        fwnode = irq_domain_alloc_named_fwnode("HYPERV-SYNIC");
        d = irq_domain_create_hierarchy(x86_vector_domain, 0, 0, fwnode,
                                        &hyperv_irqdomain_ops, host_data);
        hyperv_synic_domain = d;
        return 0;
}

int hyperv_alloc_vmbus_irq(...)
{
        struct irq_affinity affd = { 0, };
        struct irq_alloc_info info;
        struct cpumask *masks;
        int virq, nvec;

        init_irq_alloc_info(&info, NULL);
        /* You can add HV specific fields in info to transport data */
        info.type = X86_IRQ_ALLOC_TYPE_HV_SYNIC;

        nvec = num_possible_cpus();

        /*
         * Create an array of affinity masks which are spread out
         * over all possible cpus.
         */
        masks = irq_create_affinity_masks(nvec, &affd);

        /*
         * Allocate the interrupts which are affined to the affinity masks
         * in the masks array and marked as managed.
         */
        virq = __irq_domain_alloc_irqs(hyperv_synic_domain, -1, nvec,
                                       NUMA_NO_NODE, &info, false, masks);
        kfree(masks);

        /*
         * This returns the base irq number. The per cpu interrupt is
         * simply: virq + CPUNR, if the CPU space is linear. If there are
         * holes in the cpu_possible_mask, then you need more magic.
         *
         * On the call site you simply do:
         *      for (i = 0; i < nvec; i++)
         *              request_irq(virq + i, ......);
         */
        return virq;
}
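
For reference, one possible shape for the hyperv_irqdomain_alloc()/free()
callbacks which the sketch above leaves open, loosely modeled on the UV
code in x86/platform/uv/uv_irq.c. The hwirq numbering (virq + i), the NULL
chip_data and the use of handle_percpu_irq are assumptions borrowed from
that driver, not a settled design:

static int hyperv_irqdomain_alloc(struct irq_domain *d, unsigned int virq,
                                  unsigned int nr_irqs, void *arg)
{
        unsigned int i;
        int ret;

        /* Let the parent x86 vector domain allocate the actual vectors */
        ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, arg);
        if (ret < 0)
                return ret;

        for (i = 0; i < nr_irqs; i++) {
                /*
                 * Hook up the dummy chip. The hwirq number is assumed to
                 * be virq + i; chip_data could carry a HV specific per
                 * interrupt structure instead of NULL.
                 */
                irq_domain_set_info(d, virq + i, virq + i, &hyperv_synic,
                                    NULL, handle_percpu_irq, NULL, NULL);
        }
        return 0;
}

static void hyperv_irqdomain_free(struct irq_domain *d, unsigned int virq,
                                  unsigned int nr_irqs)
{
        irq_domain_free_irqs_common(d, virq, nr_irqs);
}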
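
And a rough illustration of the call site loop described in the comment in
hyperv_alloc_vmbus_irq(); hyperv_vmbus_isr, the "hyperv-synic" name and the
NULL dev_id are placeholders:

/* Placeholder handler; the real one would process the SYNIC messages */
static irqreturn_t hyperv_vmbus_isr(int irq, void *dev_id)
{
        return IRQ_HANDLED;
}

/*
 * Request the managed per cpu interrupts once. As described above, no
 * further driver side action is needed for CPU online/offline events.
 */
static int hyperv_request_vmbus_irqs(int virq, int nvec)
{
        int i, ret;

        for (i = 0; i < nvec; i++) {
                ret = request_irq(virq + i, hyperv_vmbus_isr, 0,
                                  "hyperv-synic", NULL);
                if (ret)
                        return ret;
        }
        return 0;
}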