From: Keno Fischer
Date: Sun, 24 May 2020 22:25:17 -0400
Subject: mm: Behavior of process_vm_* with short local buffers
To: linux-mm@kvack.org
Cc: Andrew Morton, Linux Kernel Mailing List, Kyle Huey, "Robert O'Callahan", Valentin Churavy

Hi everyone,

I'm in the process of trying to port a debugging tool (http://rr-project.org/) from x86 to various other architectures. This tool relies on noting every change that was made to the memory of the process being debugged. As such, it has a battery of tests for corner cases of copyin/out and it is one of these that I saw behaving strangely when ported to non-x86 architectures.

This particular test was testing the behavior of process_vm_readv (and writev, but for simplicity, let's assume readv here) with short local buffers. On x86 if the buffer is short and the following page is unmapped, the syscall will fill the remainder of the page, and then return however many bytes it actually wrote. However, on other architectures (I mostly looked at arm64, though the same applies elsewhere), the behavior can be quite different.
In general, the behavior depends strongly on factors like how close to the start of the copy region the page break occurs, how many bytes were supposed to be left after the page break and the total size of the region to be copied. In various situations, I'm seeing:

- Writes that end many bytes before the page break
- Bytes being modified beyond what the syscall result would indicate happened.
- Combinations thereof

I can work around this in my port, but I thought it might be valuable to ask where the line is between "architecture-defined behavior" and a bug that should be reported to the appropriate architecture maintainers and eventually fixed. For example, I think it would be nice if the syscall result actually did match the actual number of bytes written in all cases.

I've written a small program [1] that sets up this situation for various parameter values and prints the results. I have access to arm64, powerpc and x86, so I included results for those architectures, but I suspect other architectures have similar issues. The program should be easy to run to get your own results for a different architecture.

[1] https://gist.github.com/Keno/b247bca85219c4e3bdde9f7d7ff36c77

Thanks,
Keno
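
For readers who want to see the shape of the scenario without opening the gist, the following is a minimal sketch of one such probe. It is not the program from [1]. The names probe_short_local_buffer, struct probe_result, FILL_BYTE and SRC_BYTE are hypothetical, and as a simplifying assumption the process reads from its own memory (pid = getpid()) rather than from a separate debuggee. The local buffer starts `avail` bytes before an unmapped page and the call asks for `len` bytes, so the syscall's return value can be compared with the number of bytes that actually changed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define FILL_BYTE 0x55 /* pattern placed in the local page before the call */
#define SRC_BYTE  0xAA /* pattern placed in the source buffer */

/* Hypothetical result record: what the syscall said vs. what changed. */
struct probe_result {
    ssize_t reported; /* return value of process_vm_readv (-1 on error) */
    int     err;      /* errno when reported == -1, otherwise 0 */
    size_t  modified; /* local bytes that no longer hold FILL_BYTE */
};

/*
 * Request `len` bytes into a local buffer that begins `avail` bytes before
 * an unmapped page. Returns 0 if the probe could be set up, -1 otherwise.
 */
int probe_short_local_buffer(size_t avail, size_t len, struct probe_result *out)
{
    assert(out != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (avail > page || len == 0)
        return -1;

    /* Source region: fully mapped, rounded up to whole pages. */
    size_t src_len = (len + page - 1) / page * page;
    unsigned char *src = mmap(NULL, src_len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (src == MAP_FAILED)
        return -1;

    /* Destination: two pages, of which the second is unmapped below. */
    unsigned char *dst = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (dst == MAP_FAILED) {
        munmap(src, src_len);
        return -1;
    }
    memset(src, SRC_BYTE, src_len);
    memset(dst, FILL_BYTE, page);
    if (munmap(dst + page, page) != 0) {
        munmap(dst, 2 * page);
        munmap(src, src_len);
        return -1;
    }

    struct iovec local  = { .iov_base = dst + page - avail, .iov_len = len };
    struct iovec remote = { .iov_base = src, .iov_len = len };

    errno = 0;
    out->reported = process_vm_readv(getpid(), &local, 1, &remote, 1, 0);
    out->err = (out->reported == -1) ? errno : 0;

    /* Count every byte of the mapped local page that was overwritten. */
    out->modified = 0;
    for (size_t i = 0; i < page; i++)
        if (dst[i] != FILL_BYTE)
            out->modified++;

    munmap(dst, page);
    munmap(src, src_len);
    return 0;
}
```

Calling probe_short_local_buffer(64, 256, &r) sets up a 256-byte request with only 64 writable bytes before the page break. Going by the description above, x86 would be expected to give r.reported == 64 and r.modified == 64, while the other architectures may show r.modified below 64 or r.modified greater than r.reported. Sweeping `avail` and `len` over a range of values is one way to map out where the two numbers diverge.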