Date: Thu, 29 Apr 2021 10:23:31 -0700
From: khsieh@codeaurora.org
To: Stephen Boyd
Cc: aravindh@codeaurora.org, robdclark@gmail.com, sean@poorly.run,
 abhinavk@codeaurora.org, airlied@linux.ie, daniel@ffwll.ch,
 linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org,
 freedreno@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] drm/msm/dp: service only one irq_hpd if there are
 multiple irq_hpd pending

On 2021-04-29 02:26, Stephen Boyd wrote:
> Quoting khsieh@codeaurora.org (2021-04-28 10:38:11)
>> On 2021-04-27 17:00, Stephen Boyd wrote:
>> > Quoting aravindh@codeaurora.org (2021-04-21 11:55:21)
>> >> On 2021-04-21 10:26, khsieh@codeaurora.org wrote:
>> >> >>
>> >> >>> +
>> >> >>>         mutex_unlock(&dp->event_mutex);
>> >> >>>
>> >> >>>         return 0;
>> >> >>> @@ -1496,6 +1502,9 @@ int msm_dp_display_disable(struct msm_dp *dp,
>> >> >>> struct drm_encoder *encoder)
>> >> >>>         /* stop sentinel checking */
>> >> >>>         dp_del_event(dp_display, EV_DISCONNECT_PENDING_TIMEOUT);
>> >> >>>
>> >> >>> +       /* link is down, delete pending irq_hdps */
>> >> >>> +       dp_del_event(dp_display, EV_IRQ_HPD_INT);
>> >> >>> +
>> >> >>
>> >> >> I'm becoming convinced that the whole kthread design and event
>> >> >> queue is broken. These sorts of patches are working around the
>> >> >> larger problem that the kthread is running independently of the
>> >> >> driver and irqs can come in at any time but the event queue is
>> >> >> not checked from the irq handler to debounce the irq event. Is
>> >> >> the event queue necessary at all? I wonder if it would be simpler
>> >> >> to just use an irq thread and process the hpd signal from there.
>> >> >> Then we're guaranteed to not get an irq again until the irq
>> >> >> thread is done processing the event. This would naturally
>> >> >> debounce the irq hpd event that way.
>> >> > The event q is just like the bottom half of an irq handler: it
>> >> > turns irqs into events and handles them sequentially.
>> >> > irq_hpd is an asynchronous event from the panel to get the
>> >> > attention of the host during run time.
>> >> > Here, the dongle is unplugged and the main link has been torn
>> >> > down, so there is no need to service any pending irq_hpd.
>> >> >
>> >>
>> >> As Kuogee mentioned, IRQ_HPD is a message received from the panel
>> >> and is not like your typical HW generated IRQ. There is no guarantee
>> >> that we will not receive an IRQ_HPD until we are finished with
>> >> processing of an earlier HPD message or an IRQ_HPD message. For
>> >> example - when you run the protocol compliance, when we get a HPD
>> >> from the sink, we are expected to start reading DPCD, EDID and
>> >> proceed with link training. As soon as link training is finished
>> >> (which is marked by a specific DPCD register write), the sink is
>> >> going to issue an IRQ_HPD. At this point, we may not be done with
>> >> processing the HPD high as after link training we would typically
>> >> notify the user mode of the newly connected display, etc.
>> >
>> > Given that the irq comes in and is then forked off to processing at
>> > a later time implies that IRQ_HPD can come in at practically anytime.
>> > Case in point, this patch, which is trying to selectively search
>> > through the "event queue" and then remove the event that is no longer
>> > relevant because the display is being turned off either by userspace
>> > or because HPD has gone away. If we got rid of the queue and kthread
>> > and processed irqs in a threaded irq handler I suspect the code would
>> > be simpler and not have to search through an event queue when we
>> > disable the display. Instead while disabling the display we would
>> > make sure that the irq thread isn't running anymore with
>> > synchronize_irq() or even disable the irq entirely, but really it
>> > would be better to just disable the irq in the hardware with a
>> > register write to some irq mask register.
>> >
>> > This pushes more of the logic for HPD and connect/disconnect into the
>> > hardware and avoids reimplementing that in software: searching
>> > through the queue, checking for duplicate events, etc.
>>
>> I wish we could implement it as you suggested, but it is more
>> complicated than that.
>> Let me explain below.
>> We have 3 transactions, defined as follows:
>>
>> plugin transaction: the irq handler does host dp ctrl initialization
>> and link training. If sink_count = 0 or link training failed, the
>> transaction ends. Otherwise it sends the display up uevent to the
>> framework and waits for the framework thread to do the mode set, start
>> the pixel clock and start video to end the transaction.
>
> Why do we need to wait for userspace to start video? HPD is indicating
> that we have something connected, so shouldn't we merely signal to
> userspace that something is ready to display and then enable the irq
> for IRQ_HPD?
>

Yes, that is correct. The problem is an unplug that happens after we
signal user space. If the unplug happens before user space starts the
mode set and video, the driver can just do nothing and return. But what
if the unplug happens in the middle of user space doing the mode set
and starting video? Remember that we ran into a problem where the
system showed the connected state while the dongle was unplugged, and
vice versa.

>>
>> unplugged transaction: the irq handler sends the display off uevent to
>> the framework and waits for the framework to disable the pixel clock,
>> tear down the main link and do dp ctrl host de-initialization.
>
> What do we do if userspace is slow and doesn't disable the display
> before the cable is physically plugged in again?
>

A plug-in is not handled (it is re-entered into the event q) until
unplug handling has completed.

>>
>> irq_hpd transaction: this only happens after the plugin transaction
>> and before the unplug transaction. The irq handler reads the panel
>> dpcd registers and performs the requested action. Actions include
>> performing dp compliance phy/link testing.
>>
>> Since the dongle can be plugged/unplugged at any time, three
>> conditions have to be met to avoid race conditions:
>> 1) no irq is lost
>> 2) the order in which irqs happen is enforced at execution
>> 3) no irq is handled in the middle of a transaction
>>
>> for example we do not want to see
>> plugin --> unplug --> plugin --> unplug become plugin --> plugin -->
>> unplug
>>
>> The purpose of this patch is to not handle pending irq_hpd after
>> either the dongle or the monitor has been unplugged, until the next
>> plug in.
>>
>
> I'm not suggesting to block irq handling entirely for long running
> actions. A plug irq due to HPD could still notify userspace that the
> display is connected but when an IRQ_HPD comes in we process it in the
> irq thread instead of trying to figure out what sort of action is
> necessary to quickly fork it off to a kthread to process later.
>
> The problem seems to be that this quick forking off of the real IRQ_HPD
> processing is letting the event come in, and then an unplug to come in
> after that, and then a plug in to come in after that, leading to the
> event queue getting full of events that are no longer relevant but
> still need to be processed. If this used a workqueue instead of an
> open-coded one, I'd say we should cancel any work items on the queue if
> an unplug irq came in. That way we would make sure that we're not
> trying to do anything with the link when it isn't present anymore.
>

Is this the same as deleting irq_hpd from the event q? What happens if
the workqueue has already been launched?

> But even then it doesn't make much sense. Userspace could be heavily
> delayed after the plug in irq, when HPD is asserted, and not display
> anything. The user could physically unplug and plug during that time so
> we really need to not wait at all or do anything besides note the state
> of the HPD when this happens. The IRQ_HPD irq is different. I don't
> think we care to keep getting them if we're not done processing the
> previous irq. I view it as basically an "edge" irq that we see,
> process, and then if another one comes in during the processing time we
> ignore it. There's only so much we can do, hence the suggestion to use
> a threaded irq.
>

I do not think you can ignore irq_hpd. For example, you connect an HDMI
monitor to the dongle, plug the dongle into the DUT, and then unplug
the HDMI monitor immediately. The DP driver will see a plugin irq with
sink_count = 1 followed by an irq_hpd with sink_count = 0. If that
irq_hpd is ignored, we may end up thinking the link is in the connected
state when it should actually be in the disconnected state.
I do not think we can ignore irq_hpd, but we can combine multiple
irq_hpd into one and handle that.

> This is why IRQ_HPD is yanking the HPD line down to get the attention
> of the source, but HPD high and HPD low for an extended period of time
> means the cable has been plugged or unplugged. We really do care if the
> line goes low for a long time, but if it only temporarily goes low for
> an IRQ_HPD then we either saw it or we didn't have time to process it
> yet.
>
> It's like a person at your door ringing the doorbell. They're there
> (HPD high), and they're ringing the doorbell over and over (IRQ_HPD)
> and eventually they go away when you don't answer (HPD low). We don't
> have to keep track of every single doorbell/IRQ_HPD event because it's
> mostly a ping from the sink telling us we need to go do something, i.e.
> a transitory event. The IRQ_HPD should always work once HPD is there,
> but once HPD is gone we should mask it and ignore that irq until we see
> an HPD high again.

If an Amazon delivery person rings the doorbell 3 times and we answer
the door on the third ring, it means the first and second rings can be
ignored.
Also, if the doorbell rings 3 times and the delivery person leaves a
package at the door and then leaves, and you see them leave from the
window, you still need to answer the door to find out that a package
was left. If you ignore the doorbell, you will miss the package.

I believe both a threaded irq and the event q work, but I think the
event q gives us finer control.
We are trying to fix an extreme case that generates an unexpected
number of irq_hpd at unexpected times. I believe other DP drivers (not
Qcom) would also fail in this particular case.
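
To make the two ideas in this thread concrete (combining several
pending irq_hpd into one, and deleting pending irq_hpd once the link is
down), here is a minimal userspace sketch in C. It is not the msm dp
driver code: struct evq, evq_add(), evq_del(), evq_pop() and the EV_*
ids are hypothetical names made up for illustration, and there is no
locking, which a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event ids, for illustration only. */
enum ev_id { EV_PLUG, EV_UNPLUG, EV_IRQ_HPD };

#define EVQ_MAX 16

/* Simple FIFO of pending events (assumed layout, not the driver's). */
struct evq {
	enum ev_id ev[EVQ_MAX];
	size_t len;
};

/*
 * Append an event. An irq_hpd that arrives while another irq_hpd is
 * already the newest pending event is merged into it, so a burst of
 * irq_hpd occupies one slot and is serviced once. Plug/unplug events
 * are never merged, so their order is preserved.
 * Returns false only when the queue is full.
 */
bool evq_add(struct evq *q, enum ev_id id)
{
	assert(q != NULL);
	if (id == EV_IRQ_HPD && q->len > 0 &&
	    q->ev[q->len - 1] == EV_IRQ_HPD)
		return true;	/* coalesced with the pending irq_hpd */
	if (q->len == EVQ_MAX)
		return false;
	q->ev[q->len++] = id;
	return true;
}

/*
 * Remove every pending event with the given id and keep the relative
 * order of the remaining events. Returns the number of events removed.
 */
size_t evq_del(struct evq *q, enum ev_id id)
{
	size_t kept = 0;
	size_t removed;

	assert(q != NULL);
	for (size_t i = 0; i < q->len; i++)
		if (q->ev[i] != id)
			q->ev[kept++] = q->ev[i];
	removed = q->len - kept;
	q->len = kept;
	return removed;
}

/* Pop the oldest pending event. Returns false if the queue is empty. */
bool evq_pop(struct evq *q, enum ev_id *out)
{
	assert(q != NULL && out != NULL);
	if (q->len == 0)
		return false;
	*out = q->ev[0];
	for (size_t i = 1; i < q->len; i++)
		q->ev[i - 1] = q->ev[i];
	q->len--;
	return true;
}
```

For example, queuing plug, irq_hpd, irq_hpd, irq_hpd leaves two
entries. If an unplug is then queued, evq_del(&q, EV_IRQ_HPD) removes
the one pending irq_hpd, so only plug and unplug remain to be serviced,
in that order.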