Subject: Re: [PATCH v5 13/15] KVM: s390: add function process_gib_alert_list()
From: Michael Mueller
Organization: IBM
Reply-To: mimu@linux.ibm.com
To: Halil Pasic
Cc: KVM Mailing List, Linux-S390 Mailing List, linux-kernel@vger.kernel.org,
 Martin Schwidefsky, Heiko Carstens, Christian Borntraeger, Janosch Frank,
 David Hildenbrand, Cornelia Huck, Pierre Morel
Date: Tue, 8 Jan 2019 16:21:17 +0100
Message-Id: <7e4a5077-00f0-3a0f-e21a-5bbc2fa14b70@linux.ibm.com>
In-Reply-To: <20190108135919.18048dd4@oc2783563651>
References: <20181219191756.57973-1-mimu@linux.ibm.com>
 <20181219191756.57973-14-mimu@linux.ibm.com>
 <20190108135919.18048dd4@oc2783563651>

On 08.01.19 13:59, Halil Pasic wrote:
> On Wed, 19 Dec 2018 20:17:54 +0100
> Michael Mueller wrote:
>
>> This function processes the GIB Alert List (GAL). It is required
>> to run when either a GIB alert interruption has been received or
>> a GISA that is in the alert list is cleared or dropped.
>>
>> The GAL is built up by millicode when the respective ISC bit is
>> set in the Interruption Alert Mask (IAM) and an interruption of
>> that class is observed.
>>
>> Signed-off-by: Michael Mueller
>> ---
>>  arch/s390/kvm/interrupt.c | 140 ++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 140 insertions(+)
>>
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 48a93f5e5333..03e7ba4f215a 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -2941,6 +2941,146 @@ int kvm_s390_get_irq_state(struct kvm_vcpu *vcpu, __u8 __user *buf, int len)
>>  	return n;
>>  }
>>
>> +static int __try_airqs_kick(struct kvm *kvm, u8 ipm)
>> +{
>> +	struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
>> +	struct kvm_vcpu *vcpu = NULL, *kick_vcpu[MAX_ISC + 1];
>> +	int online_vcpus = atomic_read(&kvm->online_vcpus);
>> +	u8 ioint_mask, isc_mask, kick_mask = 0x00;
>> +	int vcpu_id, kicked = 0;
>> +
>> +	/* Loop over vcpus in WAIT state. */
>> +	for (vcpu_id = find_first_bit(fi->idle_mask, online_vcpus);
>> +	     /* Until all pending ISCs have a vcpu open for airqs. */
>> +	     (~kick_mask & ipm) && vcpu_id < online_vcpus;
>> +	     vcpu_id = find_next_bit(fi->idle_mask, online_vcpus, vcpu_id)) {
>> +		vcpu = kvm_get_vcpu(kvm, vcpu_id);
>> +		if (psw_ioint_disabled(vcpu))
>> +			continue;
>> +		ioint_mask = (u8)(vcpu->arch.sie_block->gcr[6] >> 24);
>> +		for (isc_mask = 0x80; isc_mask; isc_mask >>= 1) {
>> +			/* ISC pending in IPM ? */
>> +			if (!(ipm & isc_mask))
>> +				continue;
>> +			/* vcpu for this ISC already found ? */
>> +			if (kick_mask & isc_mask)
>> +				continue;
>> +			/* vcpu open for airq of this ISC ? */
>> +			if (!(ioint_mask & isc_mask))
>> +				continue;
>> +			/* use this vcpu (for all ISCs in ioint_mask) */
>> +			kick_mask |= ioint_mask;
>> +			kick_vcpu[kicked++] = vcpu;
>
> Assuming that the vcpu can/will take all ISCs it's currently open for
> does not seem right. We kind of rely on this assumption here, right?

My latest version of this routine already follows a different strategy.
It looks for a horizontal distribution of pending ISCs among idle vcpus.
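A minimal sketch of what such a horizontal distribution could look
like (an illustration under assumptions, not the actual v6 code:
psw_ioint_disabled(), kvm_s390_vcpu_wakeup(), MAX_ISC and the CR6
subclass mask are taken from the patch above; the idle[] vcpu array,
the helper name and the claiming logic are made up):

/*
 * Sketch only: walk the pending ISCs from ISC 0 (highest priority)
 * to MAX_ISC and claim a distinct idle vcpu for each of them,
 * instead of crediting the first match with all ISCs it happens
 * to be open for. Assumes nr_idle <= BITS_PER_LONG.
 */
static int __kick_one_vcpu_per_isc(struct kvm *kvm, u8 ipm,
				   struct kvm_vcpu **idle, int nr_idle)
{
	unsigned long claimed = 0;	/* idle vcpus already selected */
	int isc, i, kicked = 0;

	for (isc = 0; isc <= MAX_ISC; isc++) {
		u8 isc_mask = 0x80 >> isc;

		if (!(ipm & isc_mask))
			continue;		/* ISC not pending */
		for (i = 0; i < nr_idle; i++) {
			/* I/O interruption subclass mask from CR6 */
			u8 ioint_mask = (u8)(idle[i]->arch.sie_block->gcr[6] >> 24);

			if (test_bit(i, &claimed))
				continue;	/* taken for another ISC */
			if (psw_ioint_disabled(idle[i]) ||
			    !(ioint_mask & isc_mask))
				continue;	/* not open for this ISC */
			__set_bit(i, &claimed);
			kvm_s390_vcpu_wakeup(idle[i]);
			kicked++;
			break;			/* next pending ISC */
		}
	}
	return kicked;
}

The point is that each pending ISC claims its own vcpu, so the kick
mask can no longer be satisfied by a single vcpu that merely happens
to be open for everything.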
>
>> +		}
>> +	}
>> +
>> +	if (vcpu && ~kick_mask & ipm)
>> +		VM_EVENT(kvm, 4, "gib alert undeliverable isc mask 0x%02x",
>> +			 ~kick_mask & ipm);
>> +
>> +	for (vcpu_id = 0; vcpu_id < kicked; vcpu_id++)
>> +		kvm_s390_vcpu_wakeup(kick_vcpu[vcpu_id]);
>> +
>> +	return (online_vcpus != 0) ? kicked : -ENODEV;
>> +}
>> +
>> +static void __floating_airqs_kick(struct kvm *kvm)
>> +{
>> +	struct kvm_s390_float_interrupt *fi = &kvm->arch.float_int;
>> +	int online_vcpus, kicked;
>> +	u8 ipm_t0, ipm;
>> +
>> +	/* Get IPM and return if clean, IAM has been restored. */
>> +	ipm = get_ipm(kvm->arch.gisa, IRQ_FLAG_IAM);
>> +	if (!ipm)
>> +		return;
>> +retry:
>> +	ipm_t0 = ipm;
>> +
>> +	/* Try to kick some vcpus in WAIT state. */
>> +	kicked = __try_airqs_kick(kvm, ipm);
>> +	if (kicked < 0)
>> +		return;
>> +
>> +	/* Get IPM and return if clean, IAM has been restored. */
>> +	ipm = get_ipm(kvm->arch.gisa, IRQ_FLAG_IAM);
>> +	if (!ipm)
>> +		return;
>> +
>> +	/* Start over, if new ISC bits are pending in IPM. */
>> +	if ((ipm_t0 ^ ipm) & ~ipm_t0)
>> +		goto retry;
>> +
>
>> +	/*
>> +	 * Return as we just kicked at least one vcpu in WAIT state
>> +	 * open for airqs. The IAM will be restored at the latest when
>> +	 * one of them goes into WAIT or STOP state.
>> +	 */
>> +	if (kicked > 0)
>> +		return;
>
>> +
>> +	/*
>> +	 * No vcpu was kicked either because no vcpu was in WAIT state
>> +	 * or none of the vcpus in WAIT state are open for airqs.
>> +	 * Return immediately if no vcpus are in WAIT state.
>> +	 * There are vcpus in RUN state. They will process the airqs
>> +	 * if not closed for airqs as well. In that case the system will
>> +	 * delay airqs until a vcpu decides to take airqs again.
>> +	 */
>> +	online_vcpus = atomic_read(&kvm->online_vcpus);
>> +	if (!bitmap_weight(fi->idle_mask, online_vcpus))
>> +		return;
>> +
>> +	/*
>> +	 * None of the vcpus in WAIT state take airqs and we might
>> +	 * have no running vcpus as at least one vcpu is in WAIT state
>> +	 * and IPM is dirty.
>> +	 */
>> +	set_iam(kvm->arch.gisa, kvm->arch.iam);
>> +}
>> +
>> +#define NULL_GISA_ADDR 0x00000000UL
>> +#define NONE_GISA_ADDR 0x00000001UL
>> +#define GISA_ADDR_MASK 0xfffff000UL
>> +
>> +static void __maybe_unused process_gib_alert_list(void)
>> +{
>> +	u32 final, next_alert, origin = 0UL;
>> +	struct kvm_s390_gisa *gisa;
>> +	struct kvm *kvm;
>> +
>> +	do {
>> +		/*
>> +		 * If the NONE_GISA_ADDR is still stored in the alert list
>> +		 * origin, we will leave the outer loop. No further GISA has
>> +		 * been added to the alert list by millicode while processing
>> +		 * the current alert list.
>> +		 */
>> +		final = (origin & NONE_GISA_ADDR);
>> +		/*
>> +		 * Cut off the alert list and store the NONE_GISA_ADDR in the
>> +		 * alert list origin to avoid further GAL interruptions.
>> +		 * A new alert list can be built up by millicode in parallel
>> +		 * for guests not in the yet cut-off alert list. When in the
>> +		 * final loop, store the NULL_GISA_ADDR instead. This will
>> +		 * re-enable GAL interruptions on the host again.
>> +		 */
>> +		origin = xchg(&gib->alert_list_origin,
>> +			      (!final) ? NONE_GISA_ADDR : NULL_GISA_ADDR);
>> +		/* Loop through the just cut-off alert list. */
>> +		while (origin & GISA_ADDR_MASK) {
>> +			gisa = (struct kvm_s390_gisa *)(u64)origin;
>> +			next_alert = gisa->next_alert;
>> +			/* Unlink the GISA from the alert list. */
>> +			gisa->next_alert = origin;
>> +			kvm = container_of(gisa, struct sie_page2, gisa)->kvm;
>> +			/* Kick suitable vcpus */
>> +			__floating_airqs_kick(kvm);
>
> We may finish handling the alerted GISA with the IAM not set, e.g.
> some vcpus were kicked but the IPM is still dirty and some vcpus are
> still in WAIT, right?

My above-mentioned change to the routine that identifies the vcpus to
kick will select one vcpu for each pending ISC if possible (depending
on the number of idle vcpus, their respective interruption masks, and
the pending ISCs). That does not rule out the scenario in principle
that only one vcpu is kicked while multiple ISCs are pending (IPM
still dirty), although I have never observed this with a Linux guest.
What I was trying to avoid, for resource-utilization reasons, was a
kind of busy loop monitoring the IPM state in addition to the kicked
vcpus.

>
> From the comments it seems we speculate on being in a safe state, as
> these are supposed to return to wait or stop soon-ish, and we will
> set the IAM then (See ).

I don't quite understand. Yes, the next vcpu going idle shall restore
the IAM or process the top pending ISC if the I/O mask (GCR) allows.
vcpus are not allowed to go into disabled wait (I/O interrupts
disabled by PSW). If all vcpus mask a specific ISC (for some time),
the guest does not want to get interrupted for that ISC, but it will
be as soon as a running vcpu opens the mask again.

>
> According to my current understanding we might end up losing
> initiative in this scenario. Or am I wrong?

I currently have no proof that you are wrong, but I have not observed
that situation yet.
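To make that idle-path contract concrete, a rough sketch of what a
vcpu entering WAIT state would have to do (the hook itself is
hypothetical; get_ipm(), set_iam(), IRQ_FLAG_IAM, kvm->arch.iam and
the CR6 subclass mask come from the quoted patch, and I assume
get_ipm() restores the IAM itself when the IPM is clean, as the patch
comments indicate):

/*
 * Sketch, not from the series: a vcpu about to enter WAIT state
 * either takes over a pending ISC it is open for, or re-arms the
 * alerting mechanism by restoring the IAM.
 */
static void gisa_vcpu_enters_wait(struct kvm_vcpu *vcpu)
{
	struct kvm *kvm = vcpu->kvm;
	/* I/O interruption subclass mask from CR6, bits 0-7 */
	u8 ioint_mask = (u8)(vcpu->arch.sie_block->gcr[6] >> 24);
	u8 ipm;

	/* if the IPM is clean, get_ipm() has restored the IAM already */
	ipm = get_ipm(kvm->arch.gisa, IRQ_FLAG_IAM);
	if (!ipm)
		return;
	if (ipm & ioint_mask)
		return;	/* a pending ISC is open: deliver it instead */
	/*
	 * IPM is dirty, but this vcpu may not deliver any of the
	 * pending ISCs: restore the IAM so the host is alerted again
	 * when an ISC of interest becomes pending.
	 */
	set_iam(kvm->arch.gisa, kvm->arch.iam);
}

As long as every vcpu runs through such a hook before blocking, a
dirty IPM is either consumed or the IAM is re-armed, which is the
property the initiative question hinges on.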
>
> Regards,
> Halil
>
>> +			origin = next_alert;
>> +		}
>> +	} while (!final);
>> +}
>> +
>>  static void nullify_gisa(struct kvm_s390_gisa *gisa)
>>  {
>>  	memset(gisa, 0, sizeof(struct kvm_s390_gisa));