Received: by 2002:a05:6a10:1287:0:0:0:0 with SMTP id d7csp6457779pxv; Thu, 29 Jul 2021 15:28:24 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzxPc7GZRwHWcFG6Uq3D8PiLB3xX4VuIoPy4gOBwn+qJDZsXv9wTBlVvVaDuTVKa7vV0f92 X-Received: by 2002:a92:d112:: with SMTP id a18mr5128616ilb.67.1627597704783; Thu, 29 Jul 2021 15:28:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1627597704; cv=none; d=google.com; s=arc-20160816; b=WVz4EJ7LmvgH4T5eg4DxYpkJpQsqyINhXQTGIDWkR0sMGbWTE4nXl2gDgtwyBAZrce XTS+zzJAMO6OPlyY+cq7PpdHLq/MGJMBXLykd95bR0ZFKEisrbNnXaHP8ZV3f1C2ztR7 3DBOcjqVATMN1Kr6o8GSNdM0wW++33aZ31N2amFY8BpGWeHYnk/O2Koe5ySxMTZCwR88 CHdDRYd6FwOHKkTPXOFLaUxFY/wdKpLkAjCvAWvTwxJ3S7Zi9DZM5trPihi7T+FumIuK vJbTwkPiRIyPyrXW2OczBKQN3nfTWDqXnNmMSn8IKNUKgSiVtOZfIKvlLM4p+V8QraOY OoVA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=0qEtOVI+2bxrwcA0/khpilqHg8S4yKEf0/piqF/74tU=; b=LyhWRuLmYkDKp8F2r+/l+gR8A5NJN/QvfG9wxIb/xVCe9aDrHZtXmVBQoDYCz33nD+ 38b/jFzgme1GLKmmxdaiI3eoIjgwPwDdv0cXZ6BkfTrAmO7BwShNRBm9EIJVwJ3/WkP0 S5sieOufUM6QgAklWgJatMXKZ01VCxcSfzZMOUXZ6PV6NXvoDEZ8eTO23szgfPSIapgz b8CIULPg9/dKSWIiMGKgwY770peVP2fbhKxhi4tTTA6fSyrZw/MyTN+RWxiWSIVTjs+Z 8Yc+KZkvuo3RUv7XUDyKlYRLewIB7v8d9EnPpRHg+WNcO0SEZONLVO+XRW/wsRgTAdCe H1Rw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@android.com header.s=20210112 header.b=WNyPqykz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=android.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a16si4600849ilb.149.2021.07.29.15.28.12; Thu, 29 Jul 2021 15:28:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@android.com header.s=20210112 header.b=WNyPqykz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=android.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229749AbhG2W11 (ORCPT + 99 others); Thu, 29 Jul 2021 18:27:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35920 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229510AbhG2W11 (ORCPT ); Thu, 29 Jul 2021 18:27:27 -0400 Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A2A8C061765 for ; Thu, 29 Jul 2021 15:27:23 -0700 (PDT) Received: by mail-pj1-x1036.google.com with SMTP id u9-20020a17090a1f09b029017554809f35so17821930pja.5 for ; Thu, 29 Jul 2021 15:27:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=android.com; s=20210112; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=0qEtOVI+2bxrwcA0/khpilqHg8S4yKEf0/piqF/74tU=; b=WNyPqykzza9gkT1w9OQXvq0HkGQimtoedVVSXRi/bk9yvFDLA2dhQm4bE490HMhoLB fxlXtzImR2x82rDCE0GlyIV0aeBfcAlYTQD+XYdVjWOrvhwiVsO9Zfq2fEJTpZI0uMW2 O6h387/SWxbh1Efq7/BdE0XpThwQN/gcuhN3QUpefrigO9x9unlsJGhOZ98RSTFZLCOZ 7YKPcVKu6txlyzL7zxT9H9010oJPlQBHExj7CyNUnHLiEeElSUpl7rTa7cCO4/GaQPvq ER1ZWF/WvryYLQ6IHMrR69U1zHxKDN1fbkmyeGs1gTfwosrNf6c7ePuARagnfJXOHY2X WBhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=0qEtOVI+2bxrwcA0/khpilqHg8S4yKEf0/piqF/74tU=; b=P2eABYGx1Q1LbMqd6oklg3UTbBoRvs9TLp8o43QmzuxvfvOyedukA451yRO9UoBO28 eoSmXuT3XJ18iPWYkY9MCvL6/xRlBMe5XQcvPd5+KBfRoSnqSqLbWqoEzrKuAGSNPORl GTiraF2UYvv/BVyEWcvWDrhWSeYiqwBWlrZjhV1zvukvCUvalA39D8Ji0atL4T8ddPS5 4Y8TliJsyzZ0GOg8jNJH6L8oVjM5HCtck1K5c4IiuKyuR8Ju+QEOJlKcwFO6z94BnDnC 8wXn9O8SZw1AEpioiAzFUXXSCwnDTb09gYSQo9ZPHwxuUPD85lDvFWwdA8RboVUFrBLc 1bVQ== X-Gm-Message-State: AOAM5326ilO2Aw3lTsEZpGe+N9ad01ivPN1AnWNHQwJFML8eY+xqYp6K u8dTOXri7QmL7flMGq85FOWIgg== X-Received: by 2002:a17:90a:a105:: with SMTP id s5mr7506299pjp.9.1627597643146; Thu, 29 Jul 2021 15:27:23 -0700 (PDT) Received: from sspatil2.c.googlers.com.com (190.40.105.34.bc.googleusercontent.com. [34.105.40.190]) by smtp.gmail.com with ESMTPSA id 5sm4761989pfp.154.2021.07.29.15.27.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 15:27:22 -0700 (PDT) From: Sandeep Patil To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Sandeep Patil , torvalds@linux-foundation.org, dhowells@redhat.com, gregkh@linuxfoundation.org, stable@vger.kernel.org, kernel-team@android.com Subject: [PATCH 0/1] Revert change in pipe reader wakeup behavior Date: Thu, 29 Jul 2021 22:26:34 +0000 Message-Id: <20210729222635.2937453-1-sspatil@android.com> X-Mailer: git-send-email 2.32.0.554.ge1b32706d8-goog MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The commit <1b6b26ae7053>("pipe: fix and clarify pipe write wakeup logic") changed pipe write logic to wakeup readers only if the pipe was empty at the time of write. However, there are libraries that relied upon the older behavior for notification scheme similar to what's described in [1] One such library 'realm-core'[2] is used by numerous Android applications. The library uses a similar notification mechanism as GNU Make but it never drains the pipe until it is full. When Android moved to v5.10 kernel, all applications using this library stopped working. The C program at the end of this email mimics the library code. The program works with 5.4 kernel. It fails with v5.10 and I am fairly certain it will fail wiht v5.5 as well. The single patch in this series restores the old behavior. With the patch, the test and all affected Android applications start working with v5.10 After reading through epoll(7), I think the pipe should be drained after each epoll_wait() comes back. Also, that a non-empty pipe is considered to be "ready" for readers. The problem is that prior to the commit above, any new data written to non-empty pipes would wakeup threads waiting in epoll(EPOLLIN|EPILLET) and thats how this library worked. I do think the program below is using EPOLLET wrong. However, it used to work before and now it doesn't. So, I thought it is worth asking if this counts as userspace break. There was also a symmetrical change made to pipe_read in commit ("pipe: fix and clarify pipe read wakeup logic") that I am not sure needs changing. The library has since been fixed[3] but it will be a while before all applications incorporate the updated library. - ssp 1. https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/ 2. https://github.com/realm/realm-core 3. https://github.com/realm/realm-core/issues/4666 ==== #include #include #include #include #include #include #include #include #include #include #include #define FIFO_NAME "epoll-test-fifo" pthread_t tid; int max_delay_ms = 20; int fifo_fd; unsigned char written; unsigned char received; int epoll_fd; void *wait_on_fifo(void *unused) { while (1) { struct epoll_event ev; int ret; unsigned char c; ret = epoll_wait(epoll_fd, &ev, 1, 5000); if (ret == -1) { /* If interrupted syscall, continue .. */ if (errno == EINTR) continue; /* epoll_wait failed, bail.. */ error(99, errno, "epoll_wait failed \n"); } /* timeout */ if (ret == 0) break; if (ev.data.fd == fifo_fd) { /* Assume this is notification where the thread is catching up. * pipe is emptied by the writer when it detects it is full. */ received = written; } } return NULL; } int write_fifo(int fd, unsigned char c) { while (1) { int actual; char buf[1024]; ssize_t ret = write(fd, &c, 1); if (ret == 1) break; /* * If the pipe's buffer is full, we need to read some of the old data in * it to make space. We dont read in the code waiting for * notifications so that we can notify multiple waiters with a single * write. */ if (ret != 0) { if (errno != EAGAIN) return -EIO; } actual = read(fd, buf, 1024); if (actual == 0) return -errno; } return 0; } int create_and_setup_fifo() { int ret; char fifo_path[4096]; struct epoll_event ev; char *tmpdir = getenv("TMPDIR"); if (tmpdir == NULL) tmpdir = "."; ret = sprintf(fifo_path, "%s/%s", tmpdir, FIFO_NAME); if (access(fifo_path, F_OK) == 0) unlink(fifo_path); ret = mkfifo(fifo_path, 0600); if (ret < 0) error(1, errno, "Failed to create fifo"); fifo_fd = open(fifo_path, O_RDWR | O_NONBLOCK); if (fifo_fd < 0) error(2, errno, "Failed to open Fifo"); ev.events = EPOLLIN | EPOLLET; ev.data.fd = fifo_fd; ret = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fifo_fd, &ev); if (ret < 0) error(4, errno, "Failed to add fifo to epoll instance"); return 0; } int main(int argc, char *argv[]) { int ret, random; unsigned char c = 1; epoll_fd = epoll_create(1); if (epoll_fd == -1) error(3, errno, "Failed to create epoll instance"); ret = create_and_setup_fifo(); if (ret != 0) error(45, EINVAL, "Failed to setup fifo"); ret = pthread_create(&tid, NULL, wait_on_fifo, NULL); if (ret != 0) error(2, errno, "Failed to create a thread"); srand(time(NULL)); /* Write 256 bytes to fifo one byte at a time with random delays upto 20ms */ do { written = c; ret = write_fifo(fifo_fd, c); if (ret != 0) error(55, errno, "Failed to notify fifo, write #%u", (unsigned int)c); c++; random = rand(); usleep((random % max_delay_ms) * 1000); } while (written <= c); /* stop after c = 255 */ pthread_join(tid, NULL); printf("Test: %s", written == received ? "PASS\n" : "FAIL"); if (written != received) printf(": Written (%d) Received (%d)\n", written, received); close(fifo_fd); close(epoll_fd); return 0; } ==== Sandeep Patil (1): fs: pipe: wakeup readers everytime new data written is to pipe fs/pipe.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) -- 2.32.0.554.ge1b32706d8-goog