So unless something odd happens, this should be the last rc for 4.18.
Nothing particularly odd happened this last week - we got the usual
random set of various minor fixes all over. About two thirds of it is
drivers - networking, staging and usb stands out, but there's a little
bit of stuff all over (clk, block, gpu, nvme..).
Outside of drivers, the bulk is some core networking stuff, with
random changes elsewhere (minor arch updates, filesystems, core
kernel, test scripts).
The appended shortlog gives a flavor of the details.
Linus
---
Adam Thomson (1):
usb: typec: tcpm: Fix sink PDO starting index for PPS APDO selection
Aleksander Morgado (1):
qmi_wwan: fix interface number for DW5821e production firmware
Alexander Sverdlin (1):
i2c: davinci: Avoid zero value of CLKH
Amar Singhal (1):
cfg80211: never ignore user regulatory hint
Anssi Hannula (7):
can: xilinx_can: fix device dropping off bus on RX overrun
can: xilinx_can: fix RX loop if RXNEMP is asserted without RXOK
can: xilinx_can: fix recovery from error states not being propagated
can: xilinx_can: keep only 1-2 frames in TX FIFO to fix TX accounting
can: xilinx_can: fix RX overflow interrupt not being enabled
can: xilinx_can: fix incorrect clear of non-processed interrupts
can: xilinx_can: fix power management handling
Antti Seppälä (2):
usb: dwc2: Fix DMA alignment to start at allocated boundary
usb: dwc2: Fix inefficient copy of unaligned buffers
Ariel Levkovich (1):
net/mlx5: Adjust clock overflow work period
Arnd Bergmann (2):
kasan: only select SLUB_DEBUG with SYSFS=y
include/linux/eventfd.h: include linux/errno.h
Artem Savkov (1):
tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure
Benjamin Herrenschmidt (4):
clk: aspeed: Treat a gate in reset as disabled
usb: gadget: aspeed-vhub: Fix SETUP packets with OUT data phase
usb: gadget: Fix OS descriptors support
usb: gadget: aspeed: Workaround memory ordering issue
Bernd Edlinger (1):
nl80211: Add a missing break in parse_station_flags
Bin Liu (1):
usb: core: handle hub C_PORT_OVER_CURRENT condition
Carlos Maiolino (1):
xfs: Initialize variables in xfs_alloc_get_rec before using them
Chen-Yu Tsai (1):
Input: i8042 - add Lenovo LaVie Z to the i8042 reset list
Clint Taylor (1):
drm/i915/glk: Add Quirk for GLK NUC HDMI port issues.
Colin Ian King (1):
usb: dwc3: gadget: remove redundant variable maxpacket
Dan Williams (1):
mm: fix exports that inadvertently make put_page() EXPORT_SYMBOL_GPL
Daniel Borkmann (3):
bpf, ppc64: fix unexpected r0=0 exit path inside bpf_xadd
bpf: test case to check whether src/dst regs got mangled by xadd
sock: fix sg page frag coalescing in sk_alloc_sg
Dave Jiang (1):
mm: disallow mappings that conflict for devm_memremap_pages()
David Ahern (1):
net/ipv6: Fix linklocal to global address with VRF
Davidlohr Bueso (1):
ipc/sem.c: prevent queue.status tearing in semop
Denis Kenzior (1):
nl80211/mac80211: allow non-linear skb in rx_control_port
Dirk Mueller (1):
arm64: Check for errata before evaluating cpu features
Dmitry Torokhov (1):
usb: dwc2: host: do not delay retries for CONTROL IN transfers
Donald Shanty III (1):
Input: elan_i2c - add ACPI ID for lenovo ideapad 330
Doron Roberts-Kedes (1):
tls: check RCV_SHUTDOWN in tls_wait_data
Enric Balletbo i Serra (1):
usb: dwc3: rockchip: Fix PHY documentation links.
Eran Ben Elisha (2):
net/mlx5e: Fix quota counting in aRFS expire flow
net/mlx5e: Don't allow aRFS for encapsulated packets
Eric Dumazet (6):
net: skb_segment() should not return NULL
tcp: free batches of packets in tcp_prune_ofo_queue()
tcp: avoid collapses in tcp_prune_queue() if possible
tcp: detect malicious patterns in tcp_collapse_ofo_queue()
tcp: call tcp_drop() from tcp_data_queue_ofo()
tcp: add tcp_ooo_try_coalesce() helper
Eric Sandeen (1):
xfs: properly handle free inodes in extent hint validators
Esben Haabendal (1):
i2c: imx: Fix reinit_completion() use
Eugeniu Rosca (3):
usb: gadget: f_uac2: fix error handling in afunc_bind (again)
usb: gadget: u_audio: fix pcm/card naming in g_audio_setup()
usb: gadget: f_uac2: fix endianness of 'struct cntrl_*_lay3'
Fabio Estevam (1):
usb: chipidea: Always build ULPI code
Faiz Abbas (2):
can: m_can: Fix runtime resume call
can: m_can: Move accessing of message ram to after clocks are enabled
Felix Fietkau (1):
MIPS: ath79: fix register address in ath79_ddr_wb_flush()
Florian Westphal (7):
netfilter: nf_tables: use dev->name directly
netfilter: nf_tables: free flow table struct too
netfilter: nf_tables: fix memory leaks on chain rename
netfilter: nf_tables: don't allow to rename to already-pending name
netfilter: conntrack: dccp: treat SYNC/SYNCACK as invalid if no
prior state
atl1c: reserve min skb headroom
netfilter: nf_tables: move dumper state allocation into ->start
Geert Uytterhoeven (2):
MAINTAINERS: Add file patterns for serio device tree bindings
clk: Really show symbolic clock flags in debugfs
Greg Edwards (1):
block: reset bi_iter.bi_done after splitting bio
Gregory CLEMENT (1):
clk: mvebu: armada-37xx-periph: Fix switching CPU rate from
300Mhz to 1.2GHz
Guenter Roeck (1):
media: staging: omap4iss: Include asm/cacheflush.h after generic includes
Hangbin Liu (1):
multicast: do not restore deleted record source filter mode to new one
Hannes Reinecke (2):
nvmet: fixup crash on NULL device path
nvmet: only check for filebacking on -ENOTBLK
Hans de Goede (1):
Revert "staging:r8188eu: Use lib80211 to support TKIP"
Heiner Kallweit (2):
net: phy: consider PHY_IGNORE_INTERRUPT in phy_start_aneg_priv
r8169: restore previous behavior to accept BIOS WoL settings
Jack Morgenstein (1):
net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
Jaedon Shin (1):
phy: phy-brcm-usb-init: Fix power down USB 3.0 PHY when XHCI reenabled
James Smart (2):
nvmet-fc: fix target sgl list on large transfers
nvme: if_ready checks to fail io to deleting controller
Jarod Wilson (1):
bonding: set default miimon value for non-arp modes if not set
Jerome Brunet (1):
clk: meson: audio-divider is one based
Jerry Zhang (1):
usb: gadget: f_fs: Only return delayed status when len is 0
Jia-Ju Bai (2):
usb: gadget: r8a66597: Fix two possible sleep-in-atomic-context
bugs in init_controller()
usb: gadget: r8a66597: Fix a possible sleep-in-atomic-context
bugs in r8a66597_queue()
Joel Stanley (2):
clk: aspeed: Mark bclk (PCIe) and dclk (VGA) as critical
clk: aspeed: Support HPLL strapping on ast2400
Johannes Weiner (1):
arm64: fix vmemmap BUILD_BUG_ON() triggering on !vmemmap setups
John Hurley (1):
nfp: flower: ensure dead neighbour entries are not offloaded
John Keeping (1):
usb: dwc2: avoid NULL dereferences
Josef Bacik (2):
nbd: don't requeue the same request twice.
nbd: handle unexpected replies better
Joshua Frkuska (1):
usb: gadget: u_audio: update hw_ptr in iso_complete after data copied
KT Liao (1):
Input: elan_i2c - add another ACPI ID for Lenovo Ideapad 330-15AST
Keith Busch (2):
blk-mq: export setting request completion state
scsi: set timed out out mq requests to complete
Kiran Kumar Modukuri (5):
fscache: Allow cancelled operations to be enqueued
cachefiles: Fix refcounting bug in backing-file read monitoring
fscache: Fix reference overput in fscache_attach_object() error handling
cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag
cachefiles: Wait rather than BUG'ing on "Unexpected object collision"
Kirill A. Shutemov (3):
mm: introduce vma_init()
mm: use vma_init() to initialize VMAs on stack and data segments
mm: fix vma_is_anonymous() false-positives
Li Wang (1):
zswap: re-check zswap_is_full() after do zswap_shrink()
Linus Torvalds (2):
squashfs: be more careful about metadata corruption
Linux 4.18-rc7
Linus Walleij (1):
gpio: of: Handle fixed regulator flags properly
Lubomir Rintel (1):
usb: cdc_acm: Add quirk for Castles VEGA3000
Lucas Stach (2):
drm/imx: imx-ldb: disable LDB on driver bind
drm/imx: imx-ldb: check if channel is enabled before printing warning
Martin KaFai Lau (2):
bpf: btf: Clean up BTF_INT_BITS() in uapi btf.h
bpf: Use option "help" in the llvm-objcopy test
Martin Schwidefsky (1):
s390: disable gcc plugins
Martin Wilck (3):
block: bio_iov_iter_get_pages: fix size of last iovec
blkdev: __blkdev_direct_IO_simple: fix leak in error case
block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
Masahiro Yamada (1):
gpio: uniphier: set legitimate irq trigger type in .to_irq hook
Masami Hiramatsu (2):
ring_buffer: tracing: Inherit the tracing setting to next ring buffer
selftests/ftrace: Add snapshot and tracing_on test case
Neil Armstrong (1):
clk: meson-gxbb: set fclk_div2 as CLK_IS_CRITICAL
Nicholas Mc Guire (1):
can: mpc5xxx_can: check of_iomap return before use
Paolo Abeni (1):
ip: hash fragments consistently
Paul Burton (1):
MIPS: Fix off-by-one in pci_resource_to_user()
Peter Senna Tschudin (1):
tools: usb: ffs-test: Fix build on big endian systems
Raed Salem (1):
net/mlx5: Fix 'DON'T_TRAP' functionality
Rafael J. Wysocki (1):
driver core: Partially revert "driver core: correct device's
shutdown order"
Rafał Miłecki (1):
Revert "MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum"
Randy Dunlap (2):
usb/phy: fix PPC64 build errors in phy-fsl-usb.c
net: prevent ISA drivers from building on PPC32
Roi Dayan (1):
net/mlx5e: Only allow offloading decap egress (egdev) flows
Roman Fietze (1):
can: m_can.c: fix setup of CCCR register: clear CCCR NISO bit
before checking can.ctrlmode
Roopa Prabhu (4):
rtnetlink: add rtnl_link_state check in rtnl_configure_link
vxlan: add new fdb alloc and create helpers
vxlan: make netlink notify in vxlan_fdb_destroy optional
vxlan: fix default fdb entry netlink notify ordering during netdev create
Saeed Mahameed (1):
net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_mode
Samuel Thibault (1):
staging: speakup: fix wraparound in uaccess length check
Schmauss, Erik (1):
ACPICA: AML Parser: ignore dispatcher error status during table load
Sergio Paracuellos (1):
staging: ks7010: call 'hostif_mib_set_request_int' instead of
'hostif_mib_set_request_bool'
Shakeel Butt (1):
kvm, mm: account shadow page tables to kmemcg
Shay Agroskin (1):
net/mlx5e: Refine ets validation function
Shubhrajyoti Datta (1):
net: axienet: Fix double deregister of mdio
Snild Dolkow (1):
kthread, tracing: Don't expose half-written comm when creating kthreads
Stephane Grosjean (1):
can: peak_canfd: fix firmware < v3.3.0: limit allocation to
32-bit DMA addr only
Steve Longerbeam (1):
gpu: ipu-csi: Check for field type alternate
Steven Rostedt (VMware) (3):
tracing: Fix double free of event_trigger_data
tracing: Fix possible double free in event_enable_trigger_func()
tracing: Quiet gcc warning about maybe unused link variable
Sudarsana Reddy Kalluru (4):
qed: Fix link flap issue due to mismatching EEE capabilities.
qed: Fix possible race for the link state value.
qed: Correct Multicast API to reflect existence of 256
approximate buckets.
bnx2x: Fix invalid memory access in rss hash config path.
Taehee Yoo (3):
netfilter: nf_tables: fix jumpstack depth validation
netfilter: nft_set_hash: add rcu_barrier() in the nft_rhash_destroy()
netfilter: nft_set_rbtree: fix panic when destroying set by GC
Taeung Song (1):
tools/bpftool: Fix segfault case regarding 'pin' arguments
Tariq Toukan (2):
net/mlx5: Fix QP fragmented buffer allocation
net/page_pool: Fix inconsistent lock state warning
Tejun Heo (1):
delayacct: fix crash in delayacct_blkio_end() after delayacct init failure
Theodore Ts'o (6):
ext4: fix false negatives *and* false positives in
ext4_check_descriptors()
ext4: clear mmp sequence number when remounting read-only
ext4: fix inline data updates with checksums enabled
ext4: check for allocation block validity with block group locked
random: mix rdrand with entropy sent in from userspace
ext4: fix check to prevent initializing reserved inodes
Thomas Tai (1):
PCI/AER: Work around use-after-free in pcie_do_fatal_recovery()
Tony Lindgren (1):
phy: mapphone-mdm6600: Fix wrong enum used for status lines
Uwe Kleine-König (1):
net: dsa: mv88e6xxx: fix races between lock and irq freeing
Vinod Koul (1):
clk: qcom: gcc-msm8996: Disable halt check on UFS tx clock
Vivek Gautam (1):
clk/mmcc-msm8996: Make mmagic_bimc_gdsc ALWAYS_ON
Vladimir Zapolskiy (3):
usb: gadget: u_audio: remove caching of stream buffer parameters
usb: gadget: u_audio: remove cached period bytes value
usb: gadget: u_audio: protect stream runtime fields with stream spinlock
Wei Wang (1):
ipv6: use fib6_info_hold_safe() when necessary
Willem de Bruijn (1):
ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull
Wolfram Sang (2):
i2c: rcar: handle RXDMA HW behaviour on Gen3
i2c: imx: use open drain for recovery GPIO
Yuchung Cheng (3):
tcp: helpers to send special DCTCP ack
tcp: do not cancel delay-AcK on DCTCP special ACK
tcp: do not delay ACK in DCTCP upon CE status change
YueHaibing (3):
net: caif: Add a missing rcu_read_unlock() in caif_flow_cb
bpfilter: Fix mismatch in function argument types
cpufreq: qcom-kryo: add NULL entry to the end of_device_id array
Zhao Chen (1):
net-next/hinic: fix a problem in hinic_xmit_frame()
Zheng Xiaowei (1):
usb: xhci: Fix memory leak in xhci_endpoint_reset()
[email protected] (1):
mac80211: add stations tied to AP_VLANs during hw reconfig
On Mon, 30 Jul 2018 at 03:39, Linus Torvalds
<[email protected]> wrote:
>
> So unless something odd happens, this should be the last rc for 4.18.
>
> Nothing particularly odd happened this last week - we got the usual
> random set of various minor fixes all over. About two thirds of it is
> drivers - networking, staging and usb stands out, but there's a little
> bit of stuff all over (clk, block, gpu, nvme..).
>
> Outside of drivers, the bulk is some core networking stuff, with
> random changes elsewhere (minor arch updates, filesystems, core
> kernel, test scripts).
>
> The appended shortlog gives a flavor of the details.
>
> Linus
>
> ---
> Kirill A. Shutemov (3):
> mm: introduce vma_init()
> mm: use vma_init() to initialize VMAs on stack and data segments
> mm: fix vma_is_anonymous() false-positives
Hi, I have run into AOSP userspace crash with v4.18-rc7, leading to
above mm patches. bfd40eaff5ab ("mm: fix vma_is_anonymous()
false-positives") to be specific. The same userspace is working fine
with v4.18-rc6.
I didn't yet look into what is going wrong from userspace point of
view, but I just wanted to give you a heads up on this. I'll be happy
to assist in further debugging/diagnosis if required.
Here is the crash log from logcat, if it helps:
F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
F DEBUG : Build fingerprint:
'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
F DEBUG : Revision: '0'
F DEBUG : ABI: 'arm'
F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
.. <snip> ..
F DEBUG : backtrace:
F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
F DEBUG : #01 pc 0010c513 /system/lib/libart.so
(create_mspace_with_base+82)
F DEBUG : #02 pc 0015c601 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int,
unsigned int)+40)
F DEBUG : #03 pc 0015c3ed /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
std::__1::basic_string<char, std::__
1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int,
unsigned int, unsigned int, unsigned int, bool)+36)
F DEBUG : #04 pc 0013c9ab /system/lib/libart.so
(art::gc::Heap::Heap(unsigned int, unsigned int, unsigned int,
unsigned int, double, double, unsigned int, unsigned int,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char>> const&, art::InstructionSet,
art::gc::CollectorType, art::gc::CollectorType,
art::gc::space::LargeObjectSpaceType, unsigned int, unsigned int,
unsigned int, bool, unsigned int, unsigned int, bool, bool, bool,
bool, bool, bool, bool, bool, bool, bool, bool, unsigned long
long)+1674)
DEBUG : #05 pc 00318201 /system/lib/libart.so
(art::Runtime::Init(art::RuntimeArgumentMap&&)+7036)
DEBUG : #06 pc 0031af19 /system/lib/libart.so
(art::Runtime::Create(std::__1::vector<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>,
std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void
const*>>> const&, bool)+68)
F DEBUG : #07 pc 0023c353 /system/lib/libart.so (JNI_CreateJavaVM+658)
F DEBUG : #08 pc 0000205f /system/lib/libandroid_runtime.so
(android::AndroidRuntime::startVm(_JavaVM**, _JNIEnv**, bool)+5038)
F DEBUG : #09 pc 00002381 /system/lib/libandroid_runtime.so
(android::AndroidRuntime::start(char const*,
android::Vector<android::String8> const&, bool)+196)
F DEBUG : #10 pc 0000046b /system/bin/app_process32 (main+702)
Regards,
Amit Pundir
On Mon, Jul 30, 2018 at 12:17:46PM +0530, Amit Pundir wrote:
> On Mon, 30 Jul 2018 at 03:39, Linus Torvalds
> <[email protected]> wrote:
> >
> > So unless something odd happens, this should be the last rc for 4.18.
> >
> > Nothing particularly odd happened this last week - we got the usual
> > random set of various minor fixes all over. About two thirds of it is
> > drivers - networking, staging and usb stands out, but there's a little
> > bit of stuff all over (clk, block, gpu, nvme..).
> >
> > Outside of drivers, the bulk is some core networking stuff, with
> > random changes elsewhere (minor arch updates, filesystems, core
> > kernel, test scripts).
> >
> > The appended shortlog gives a flavor of the details.
> >
> > Linus
> >
> > ---
> > Kirill A. Shutemov (3):
> > mm: introduce vma_init()
> > mm: use vma_init() to initialize VMAs on stack and data segments
> > mm: fix vma_is_anonymous() false-positives
>
> Hi, I have run into AOSP userspace crash with v4.18-rc7, leading to
> above mm patches. bfd40eaff5ab ("mm: fix vma_is_anonymous()
> false-positives") to be specific. The same userspace is working fine
> with v4.18-rc6.
>
> I didn't yet look into what is going wrong from userspace point of
> view, but I just wanted to give you a heads up on this. I'll be happy
> to assist in further debugging/diagnosis if required.
Youling reported basically the same bug with zygote crashing, but on
x86-64.
I think I missed vma_set_anonymous() somewhere, but I fail to see where.
Could you check if removing 'vma->vm_ops = &dummy_vm_ops;" from vma_init
makes the problem go away?
Any chance the code that crashes can be run under strace?
> Here is the crash log from logcat, if it helps:
> F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
> F DEBUG : Build fingerprint:
> 'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
> F DEBUG : Revision: '0'
> F DEBUG : ABI: 'arm'
> F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
> F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
> .. <snip> ..
> F DEBUG : backtrace:
> F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
> F DEBUG : #01 pc 0010c513 /system/lib/libart.so
> (create_mspace_with_base+82)
> F DEBUG : #02 pc 0015c601 /system/lib/libart.so
> (art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int,
> unsigned int)+40)
> F DEBUG : #03 pc 0015c3ed /system/lib/libart.so
> (art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
> std::__1::basic_string<char, std::__
> 1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int,
> unsigned int, unsigned int, unsigned int, bool)+36)
> F DEBUG : #04 pc 0013c9ab /system/lib/libart.so
> (art::gc::Heap::Heap(unsigned int, unsigned int, unsigned int,
> unsigned int, double, double, unsigned int, unsigned int,
> std::__1::basic_string<char, std::__1::char_traits<char>,
> std::__1::allocator<char>> const&, art::InstructionSet,
> art::gc::CollectorType, art::gc::CollectorType,
> art::gc::space::LargeObjectSpaceType, unsigned int, unsigned int,
> unsigned int, bool, unsigned int, unsigned int, bool, bool, bool,
> bool, bool, bool, bool, bool, bool, bool, bool, unsigned long
> long)+1674)
> DEBUG : #05 pc 00318201 /system/lib/libart.so
> (art::Runtime::Init(art::RuntimeArgumentMap&&)+7036)
> DEBUG : #06 pc 0031af19 /system/lib/libart.so
> (art::Runtime::Create(std::__1::vector<std::__1::pair<std::__1::basic_string<char,
> std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>,
> std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
> std::__1::char_traits<char>, std::__1::allocator<char>>, void
> const*>>> const&, bool)+68)
> F DEBUG : #07 pc 0023c353 /system/lib/libart.so (JNI_CreateJavaVM+658)
> F DEBUG : #08 pc 0000205f /system/lib/libandroid_runtime.so
> (android::AndroidRuntime::startVm(_JavaVM**, _JNIEnv**, bool)+5038)
> F DEBUG : #09 pc 00002381 /system/lib/libandroid_runtime.so
> (android::AndroidRuntime::start(char const*,
> android::Vector<android::String8> const&, bool)+196)
> F DEBUG : #10 pc 0000046b /system/bin/app_process32 (main+702)
>
> Regards,
> Amit Pundir
>
--
Kirill A. Shutemov
On Sun, Jul 29, 2018 at 03:09:05PM -0700, Linus Torvalds wrote:
> So unless something odd happens, this should be the last rc for 4.18.
>
> Nothing particularly odd happened this last week - we got the usual
> random set of various minor fixes all over. About two thirds of it is
> drivers - networking, staging and usb stands out, but there's a little
> bit of stuff all over (clk, block, gpu, nvme..).
>
> Outside of drivers, the bulk is some core networking stuff, with
> random changes elsewhere (minor arch updates, filesystems, core
> kernel, test scripts).
>
Build results:
total: 134 pass: 134 fail: 0
Qemu test results:
total: 183 pass: 183 fail: 0
Fetails are available at http://kerneltests.org/builders/.
Guenter
On Mon, 30 Jul 2018 at 18:31, Kirill A. Shutemov <[email protected]> wrote:
>
> On Mon, Jul 30, 2018 at 12:17:46PM +0530, Amit Pundir wrote:
> > On Mon, 30 Jul 2018 at 03:39, Linus Torvalds
> > <[email protected]> wrote:
> > >
> > > So unless something odd happens, this should be the last rc for 4.18.
> > >
> > > Nothing particularly odd happened this last week - we got the usual
> > > random set of various minor fixes all over. About two thirds of it is
> > > drivers - networking, staging and usb stands out, but there's a little
> > > bit of stuff all over (clk, block, gpu, nvme..).
> > >
> > > Outside of drivers, the bulk is some core networking stuff, with
> > > random changes elsewhere (minor arch updates, filesystems, core
> > > kernel, test scripts).
> > >
> > > The appended shortlog gives a flavor of the details.
> > >
> > > Linus
> > >
> > > ---
> > > Kirill A. Shutemov (3):
> > > mm: introduce vma_init()
> > > mm: use vma_init() to initialize VMAs on stack and data segments
> > > mm: fix vma_is_anonymous() false-positives
> >
> > Hi, I have run into AOSP userspace crash with v4.18-rc7, leading to
> > above mm patches. bfd40eaff5ab ("mm: fix vma_is_anonymous()
> > false-positives") to be specific. The same userspace is working fine
> > with v4.18-rc6.
> >
> > I didn't yet look into what is going wrong from userspace point of
> > view, but I just wanted to give you a heads up on this. I'll be happy
> > to assist in further debugging/diagnosis if required.
>
> Youling reported basically the same bug with zygote crashing, but on
> x86-64.
>
> I think I missed vma_set_anonymous() somewhere, but I fail to see where.
>
> Could you check if removing 'vma->vm_ops = &dummy_vm_ops;" from vma_init
> makes the problem go away?
Yes removing 'vma->vm_ops = &dummy_vm_ops;" from vma_init() works.
Crash is gone with that change.
>
> Any chance the code that crashes can be run under strace?
Running strace on zygote is going to be a pain. I can check logcat
again and see if any other relatively less complex process is crashing
with similar backtrace and try to run that with strace if that is
still required.
Regards,
Amit Pundir
>
> > Here is the crash log from logcat, if it helps:
> > F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
> > F DEBUG : Build fingerprint:
> > 'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
> > F DEBUG : Revision: '0'
> > F DEBUG : ABI: 'arm'
> > F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
> > F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
> > .. <snip> ..
> > F DEBUG : backtrace:
> > F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
> > F DEBUG : #01 pc 0010c513 /system/lib/libart.so
> > (create_mspace_with_base+82)
> > F DEBUG : #02 pc 0015c601 /system/lib/libart.so
> > (art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int,
> > unsigned int)+40)
> > F DEBUG : #03 pc 0015c3ed /system/lib/libart.so
> > (art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
> > std::__1::basic_string<char, std::__
> > 1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int,
> > unsigned int, unsigned int, unsigned int, bool)+36)
> > F DEBUG : #04 pc 0013c9ab /system/lib/libart.so
> > (art::gc::Heap::Heap(unsigned int, unsigned int, unsigned int,
> > unsigned int, double, double, unsigned int, unsigned int,
> > std::__1::basic_string<char, std::__1::char_traits<char>,
> > std::__1::allocator<char>> const&, art::InstructionSet,
> > art::gc::CollectorType, art::gc::CollectorType,
> > art::gc::space::LargeObjectSpaceType, unsigned int, unsigned int,
> > unsigned int, bool, unsigned int, unsigned int, bool, bool, bool,
> > bool, bool, bool, bool, bool, bool, bool, bool, unsigned long
> > long)+1674)
> > DEBUG : #05 pc 00318201 /system/lib/libart.so
> > (art::Runtime::Init(art::RuntimeArgumentMap&&)+7036)
> > DEBUG : #06 pc 0031af19 /system/lib/libart.so
> > (art::Runtime::Create(std::__1::vector<std::__1::pair<std::__1::basic_string<char,
> > std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>,
> > std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
> > std::__1::char_traits<char>, std::__1::allocator<char>>, void
> > const*>>> const&, bool)+68)
> > F DEBUG : #07 pc 0023c353 /system/lib/libart.so (JNI_CreateJavaVM+658)
> > F DEBUG : #08 pc 0000205f /system/lib/libandroid_runtime.so
> > (android::AndroidRuntime::startVm(_JavaVM**, _JNIEnv**, bool)+5038)
> > F DEBUG : #09 pc 00002381 /system/lib/libandroid_runtime.so
> > (android::AndroidRuntime::start(char const*,
> > android::Vector<android::String8> const&, bool)+196)
> > F DEBUG : #10 pc 0000046b /system/bin/app_process32 (main+702)
> >
> > Regards,
> > Amit Pundir
> >
>
> --
> Kirill A. Shutemov
On Mon, Jul 30, 2018 at 6:01 AM Kirill A. Shutemov <[email protected]> wrote:
>
> I think I missed vma_set_anonymous() somewhere, but I fail to see where.
Honestly, by now we just need to revert that commit.
It's not even clear that it was a good idea to begin with. The rest of
the commits were cleanups, this one was driven by a incorrect
VM_BUG_ON() that triggered, and that checked "vma_is_anonymous(vma)"
without any explanations of wht it should matter.
I think the biggest problem with vma_is_anonymous() may be its name,
not what it does.
What the code historically *did* (and what vma_is_anonymous() checks)
is not "is this anonymous", but rather "does this have any special
operations associated with it".
The two are similar. But people have grown opinions about exactly what
"anonymous" means. If we had named it just "no_vm_ops()", we wouldn't
have random crazy checks for "vma_is_anonymous()" in places where it
makes no sense.
So what I think we want a real explanation for is why people who use
"vma_is_anonymous()" care. Instead of trying to change its very
historical meaning, we should look at the users, and perhaps change
its name.
In this case, for example, I think the *real* problem was described by
commit 684283988f70 ("huge pagecache: mmap_sem is unlocked when
truncation splits pmd"), and the problem is that an existing check
that required that mmap_sem was held was changed to say "only for
anonymous mappings".
But the fact is, you can truncate mappings that don't have any ops just *fine*.
So maybe that original BUG() was entirely bogus to begin with, and it
shouldn't exist at all?
Or maybe the code should test "do I have a vm_file" instead of testing
"do I have vm_ops"?
What's the problem with just doing split_huge_pmd() there when it's a
pmd_trans_huge or pmd_devmap pmd? Why is that VM_BUG_ON_VMA() there in
the first place? Why are allegedly "anonymous" mappings so special
here for locking?
Adding a few more people to the cc, they were involved the last that
time VM_BUG_ON_VMA() was modified.
New people: see commit bfd40eaff5ab ("mm: fix vma_is_anonymous()
false-positives") for details. Right now I think it's getting
reverted, but the oops explanation in the commit is about that
kernel BUG at mm/memory.c:1422!
which was/is debatable and seems to make no sense (and definitely is
still triggerable despite that commit 684283988f70 ("huge pagecache:
mmap_sem is unlocked when truncation splits pmd") that limited it a
bit - but I think it didn't limit it enough.
Linus
On Mon, 30 Jul 2018, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 6:01 AM Kirill A. Shutemov <[email protected]> wrote:
> >
> > I think I missed vma_set_anonymous() somewhere, but I fail to see where.
>
> Honestly, by now we just need to revert that commit.
>
> It's not even clear that it was a good idea to begin with. The rest of
> the commits were cleanups, this one was driven by a incorrect
> VM_BUG_ON() that triggered, and that checked "vma_is_anonymous(vma)"
> without any explanations of wht it should matter.
>
> I think the biggest problem with vma_is_anonymous() may be its name,
> not what it does.
>
> What the code historically *did* (and what vma_is_anonymous() checks)
> is not "is this anonymous", but rather "does this have any special
> operations associated with it".
>
> The two are similar. But people have grown opinions about exactly what
> "anonymous" means. If we had named it just "no_vm_ops()", we wouldn't
> have random crazy checks for "vma_is_anonymous()" in places where it
> makes no sense.
>
> So what I think we want a real explanation for is why people who use
> "vma_is_anonymous()" care. Instead of trying to change its very
> historical meaning, we should look at the users, and perhaps change
> its name.
You make a very good point on the naming. Especially confusing when
layered on top of "we call shmem 'file' here, but 'anon' there".
I have no problem with reverting -rc7's vma_is_anonymous() series.
>
> In this case, for example, I think the *real* problem was described by
> commit 684283988f70 ("huge pagecache: mmap_sem is unlocked when
> truncation splits pmd"), and the problem is that an existing check
> that required that mmap_sem was held was changed to say "only for
> anonymous mappings".
>
> But the fact is, you can truncate mappings that don't have any ops just *fine*.
>
> So maybe that original BUG() was entirely bogus to begin with, and it
> shouldn't exist at all?
>
> Or maybe the code should test "do I have a vm_file" instead of testing
> "do I have vm_ops"?
>
> What's the problem with just doing split_huge_pmd() there when it's a
> pmd_trans_huge or pmd_devmap pmd? Why is that VM_BUG_ON_VMA() there in
> the first place? Why are allegedly "anonymous" mappings so special
> here for locking?
>
> Adding a few more people to the cc, they were involved the last that
> time VM_BUG_ON_VMA() was modified.
>
> New people: see commit bfd40eaff5ab ("mm: fix vma_is_anonymous()
> false-positives") for details. Right now I think it's getting
> reverted, but the oops explanation in the commit is about that
>
> kernel BUG at mm/memory.c:1422!
>
> which was/is debatable and seems to make no sense (and definitely is
> still triggerable despite that commit 684283988f70 ("huge pagecache:
> mmap_sem is unlocked when truncation splits pmd") that limited it a
> bit - but I think it didn't limit it enough.
I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
just a compromise with those who wanted to keep something there;
I don't think we even need a WARN_ON_ONCE() now.
It's historical: back in the day when only (hugetlbfs which never
gets there, and) anon THP used huge pmds for userspace, it did half-
document an obscure assumption underlying __split_huge_pmd(), and
IIRC was added in response to a trinity bug catch in that area.
(It remains quite interesting how exit_mmap() does not come that way,
and most syscalls split the vma beforehand in vma_adjust(): it's mostly
about madvise(,,MADV_DONTNEED), perhaps others now, which zap ptes
without prior splitting.)
Once DAX and huge tmpfs came in, the BUG had to be scrapped or weakened
because truncation and hole-punching can come that way without mmap_sem.
And perhaps other drivers since are taking advantage of huge pmds now,
with other assumptions.
But I have not yet taken a look to see where the -rc7 changes actually
go wrong: I'll spend a little while looking, but don't expect to find it -
certainly don't wait on me, I'll only speak up if I find something.
Hugh
On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
>
> I have no problem with reverting -rc7's vma_is_anonymous() series.
I don't think we need to revert the whole series: I think the rest are
all fairly obvious cleanups, and shouldn't really have any semantic
changes.
It's literally only that last patch in the series that then changes
that meaning of "vm_ops". And I don't really _mind_ that last step
either, but since we don't know exactly what it was that it broke, and
we're past rc7, I don't think we really have any option but the revert
it.
And if we revert it, I think we need to just remove the
VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
it is quite likely that the real bug is that overzealous BUG_ON(),
since I can't see any reason why anonymous mappings should be special
there.
But I'm certainly also ok with re-visiting that commit later. I just
think that right _now_ the above is my preferred plan.
Kirill?
> I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
> just a compromise with those who wanted to keep something there;
> I don't think we even need a WARN_ON_ONCE() now.
So to me it looks like a historical check that simply doesn't
"normally" trigger, but there's no reason I can see why we should care
about the case it tests against.
> (It remains quite interesting how exit_mmap() does not come that way,
> and most syscalls split the vma beforehand in vma_adjust(): it's mostly
> about madvise(,,MADV_DONTNEED), perhaps others now, which zap ptes
> without prior splitting.)
Well, in this case it's the ftruncate() path, which fundamentally
doesn't split the vma at all (prior *or* later). But yes, madvise() is
in the same boat - it doesn't change the vma at all, it just changes
the contents of the vma.
And exit_mmap() is special because it just tears down everything.
So we do have three very distinct cases:
(a) changing and thus splitting the vma itself (mprotect, munmap/mmap, mlock),
(b) not changing the vma, but changing the underlying mapping
(truncate and madvise(MADV_DONTNEED)
(c) tearing down everything, and no locking needed because it's the
last user (exit_mmap).
that are different for what I think are good reasons.
Linus
On Mon, 30 Jul 2018, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
> >
> > I have no problem with reverting -rc7's vma_is_anonymous() series.
>
> I don't think we need to revert the whole series: I think the rest are
> all fairly obvious cleanups, and shouldn't really have any semantic
> changes.
Okay.
>
> It's literally only that last patch in the series that then changes
> that meaning of "vm_ops". And I don't really _mind_ that last step
> either, but since we don't know exactly what it was that it broke, and
> we're past rc7, I don't think we really have any option but the revert
> it.
It took me a long time to grasp what was happening, that that last
patch bfd40eaff5ab was fixing. Not quite explained in the commit.
I think it was that by mistakenly passing the vma_is_anonymous() test,
create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
COWing pages from kcov); which the truncate then had to split, and in
going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
>
> And if we revert it, I think we need to just remove the
> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> it is quite likely that the real bug is that overzealous BUG_ON(),
> since I can't see any reason why anonymous mappings should be special
> there.
Yes, that probably has to go: but it's not clear what state it leaves
us in, with an anon THP being split by a truncate, without the expected
locking; I don't remember offhand, probably a subtler bug than that BUG,
which you may or may not consider an improvement.
I fear that Kirill has not missed inserting a vma_set_anonymous() from
somewhere that it should be, but rather that zygote is working with some
special mapping which used to satisfy vma_is_anonymous(), faults supplying
backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
finds !dummy_vm_ops.fault hence SIGBUS.
If that's so, perhaps dummy_vm_ops needs to be given a back-compatible
fault handler; or the driver(?) in question given vm_ops and that fault
handler. But when I say "back-compatible", I don't think it should ever
go so far as to supply a THP.
>
> But I'm certainly also ok with re-visiting that commit later. I just
> think that right _now_ the above is my preferred plan.
>
> Kirill?
>
> > I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
> > just a compromise with those who wanted to keep something there;
> > I don't think we even need a WARN_ON_ONCE() now.
>
> So to me it looks like a historical check that simply doesn't
> "normally" trigger, but there's no reason I can see why we should care
> about the case it tests against.
>
> > (It remains quite interesting how exit_mmap() does not come that way,
> > and most syscalls split the vma beforehand in vma_adjust(): it's mostly
> > about madvise(,,MADV_DONTNEED), perhaps others now, which zap ptes
> > without prior splitting.)
>
> Well, in this case it's the ftruncate() path, which fundamentally
> doesn't split the vma at all (prior *or* later). But yes, madvise() is
> in the same boat - it doesn't change the vma at all, it just changes
> the contents of the vma.
>
> And exit_mmap() is special because it just tears down everything.
>
> So we do have three very distinct cases:
>
> (a) changing and thus splitting the vma itself (mprotect, munmap/mmap, mlock),
Yes.
>
> (b) not changing the vma, but changing the underlying mapping
> (truncate and madvise(MADV_DONTNEED)
Yes, though I think I would distinguish the truncate & hole-punch case
from the madvise zap case, they have different characteristics in other
ways (or did, before that awkward case of truncating an anon THP surfaced).
>
> (c) tearing down everything, and no locking needed because it's the
> last user (exit_mmap).
Yes; and it goes linearly from start to finish, never jumping into
the middle of a vma, so never needing to split a THP.
>
> that are different for what I think are good reasons.
>
> Linus
On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins <[email protected]> wrote:
> On Mon, 30 Jul 2018, Linus Torvalds wrote:
>> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
>> >
>> > I have no problem with reverting -rc7's vma_is_anonymous() series.
>>
>> I don't think we need to revert the whole series: I think the rest are
>> all fairly obvious cleanups, and shouldn't really have any semantic
>> changes.
>
> Okay.
>
>>
>> It's literally only that last patch in the series that then changes
>> that meaning of "vm_ops". And I don't really _mind_ that last step
>> either, but since we don't know exactly what it was that it broke, and
>> we're past rc7, I don't think we really have any option but the revert
>> it.
>
> It took me a long time to grasp what was happening, that that last
> patch bfd40eaff5ab was fixing. Not quite explained in the commit.
>
> I think it was that by mistakenly passing the vma_is_anonymous() test,
> create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
> COWing pages from kcov); which the truncate then had to split, and in
> going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
>
>>
>> And if we revert it, I think we need to just remove the
>> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
>> it is quite likely that the real bug is that overzealous BUG_ON(),
>> since I can't see any reason why anonymous mappings should be special
>> there.
>
> Yes, that probably has to go: but it's not clear what state it leaves
> us in, with an anon THP being split by a truncate, without the expected
> locking; I don't remember offhand, probably a subtler bug than that BUG,
> which you may or may not consider an improvement.
>
> I fear that Kirill has not missed inserting a vma_set_anonymous() from
> somewhere that it should be, but rather that zygote is working with some
> special mapping which used to satisfy vma_is_anonymous(), faults supplying
> backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
> finds !dummy_vm_ops.fault hence SIGBUS.
I've been only casually following this thread (mostly just glad Amit
caught it and I could avoid having to bisect the issue in my own
Android testing), but this bit starting to shake a few old cobwebs
loose in my brain.
I'm wondering if Zygote is utilizing ashmem here, and we're somehow
traversing ashmem purged memory, or due to some setup issue the
initial traverse isn't being zero-filled as expected?
ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup()
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377
If we purge pages, it punches it out with:
vfs_fallocate(range->asma->file,
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
start, end - start);
here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447
But in ashmem_pin(), we don't do anything other then returning if we
purged any page in the range.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577
And I believe the future assumption is the if we traverse those pages
they will be zero filled (if purged or even during the initial
traversal after mmap)
Its been a long time, and I've not looked at the code in question but
it sounds from Hugh's comments above that we might instead get a
SIGBUS here.
Looking more at the problematic patch..
Amit: Does adding something like (whitespace damaged, apologies):
index a1a0025..1af6915 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct
vm_area_struct *vma)
fput(asma->file);
goto out;
}
- }
+ } else
+ vma_set_anonymous(vma);
if (vma->vm_file)
fput(vma->vm_file);
Seem to resolve it? (Sorry, I'd test it myself, but I'm away from my
desk for the night).
thanks
-john
On Mon, Jul 30, 2018 at 06:01:26PM -0700, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
> >
> > I have no problem with reverting -rc7's vma_is_anonymous() series.
>
> I don't think we need to revert the whole series: I think the rest are
> all fairly obvious cleanups, and shouldn't really have any semantic
> changes.
>
> It's literally only that last patch in the series that then changes
> that meaning of "vm_ops". And I don't really _mind_ that last step
> either, but since we don't know exactly what it was that it broke, and
> we're past rc7, I don't think we really have any option but the revert
> it.
>
> And if we revert it, I think we need to just remove the
> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> it is quite likely that the real bug is that overzealous BUG_ON(),
> since I can't see any reason why anonymous mappings should be special
> there.
>
> But I'm certainly also ok with re-visiting that commit later. I just
> think that right _now_ the above is my preferred plan.
>
> Kirill?
Considering the timing, I'm okay with reverting the last patch with
dropping the VM_BUG_ON_VMA().
But in the end I would like to see strong vma_is_anonymous().
The VM_BUG_ON_VMA() is only triggerable by the test case because
vma_is_anonymous() false-positive in fault path and we get anon-THP
allocated in file-private mapping.
I don't see immediately how this may trigger other crashes.
But it definitely looks wrong.
> > I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
> > just a compromise with those who wanted to keep something there;
> > I don't think we even need a WARN_ON_ONCE() now.
>
> So to me it looks like a historical check that simply doesn't
> "normally" trigger, but there's no reason I can see why we should care
> about the case it tests against.
I'll think more on what could go wrong with __split_huge_pmd() called on
anon-THP page without mmap_sem(). It's not yet clear cut to me.
--
Kirill A. Shutemov
On Tue, 31 Jul 2018 at 09:55, John Stultz <[email protected]> wrote:
>
> On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins <[email protected]> wrote:
> > On Mon, 30 Jul 2018, Linus Torvalds wrote:
> >> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
> >> >
> >> > I have no problem with reverting -rc7's vma_is_anonymous() series.
> >>
> >> I don't think we need to revert the whole series: I think the rest are
> >> all fairly obvious cleanups, and shouldn't really have any semantic
> >> changes.
> >
> > Okay.
> >
> >>
> >> It's literally only that last patch in the series that then changes
> >> that meaning of "vm_ops". And I don't really _mind_ that last step
> >> either, but since we don't know exactly what it was that it broke, and
> >> we're past rc7, I don't think we really have any option but the revert
> >> it.
> >
> > It took me a long time to grasp what was happening, that that last
> > patch bfd40eaff5ab was fixing. Not quite explained in the commit.
> >
> > I think it was that by mistakenly passing the vma_is_anonymous() test,
> > create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
> > COWing pages from kcov); which the truncate then had to split, and in
> > going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
> >
> >>
> >> And if we revert it, I think we need to just remove the
> >> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> >> it is quite likely that the real bug is that overzealous BUG_ON(),
> >> since I can't see any reason why anonymous mappings should be special
> >> there.
> >
> > Yes, that probably has to go: but it's not clear what state it leaves
> > us in, with an anon THP being split by a truncate, without the expected
> > locking; I don't remember offhand, probably a subtler bug than that BUG,
> > which you may or may not consider an improvement.
> >
> > I fear that Kirill has not missed inserting a vma_set_anonymous() from
> > somewhere that it should be, but rather that zygote is working with some
> > special mapping which used to satisfy vma_is_anonymous(), faults supplying
> > backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
> > finds !dummy_vm_ops.fault hence SIGBUS.
>
> I've been only casually following this thread (mostly just glad Amit
> caught it and I could avoid having to bisect the issue in my own
> Android testing), but this bit starting to shake a few old cobwebs
> loose in my brain.
>
> I'm wondering if Zygote is utilizing ashmem here, and we're somehow
> traversing ashmem purged memory, or due to some setup issue the
> initial traverse isn't being zero-filled as expected?
>
> ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup()
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377
>
>
> If we purge pages, it punches it out with:
> vfs_fallocate(range->asma->file,
> FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> start, end - start);
> here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447
>
> But in ashmem_pin(), we don't do anything other then returning if we
> purged any page in the range.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577
>
> And I believe the future assumption is the if we traverse those pages
> they will be zero filled (if purged or even during the initial
> traversal after mmap)
>
> Its been a long time, and I've not looked at the code in question but
> it sounds from Hugh's comments above that we might instead get a
> SIGBUS here.
>
> Looking more at the problematic patch..
> Amit: Does adding something like (whitespace damaged, apologies):
>
> index a1a0025..1af6915 100644
> --- a/drivers/staging/android/ashmem.c
> +++ b/drivers/staging/android/ashmem.c
> @@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct
> vm_area_struct *vma)
> fput(asma->file);
> goto out;
> }
> - }
> + } else
> + vma_set_anonymous(vma);
>
> if (vma->vm_file)
> fput(vma->vm_file);
>
This ashmem change ^^ worked too.
Regards,
Amit Pundir
>
> Seem to resolve it? (Sorry, I'd test it myself, but I'm away from my
> desk for the night).
> thanks
> -john
On Tue, Jul 31, 2018 at 12:10:06PM +0530, Amit Pundir wrote:
> On Tue, 31 Jul 2018 at 09:55, John Stultz <[email protected]> wrote:
> >
> > On Mon, Jul 30, 2018 at 8:26 PM, Hugh Dickins <[email protected]> wrote:
> > > On Mon, 30 Jul 2018, Linus Torvalds wrote:
> > >> On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
> > >> >
> > >> > I have no problem with reverting -rc7's vma_is_anonymous() series.
> > >>
> > >> I don't think we need to revert the whole series: I think the rest are
> > >> all fairly obvious cleanups, and shouldn't really have any semantic
> > >> changes.
> > >
> > > Okay.
> > >
> > >>
> > >> It's literally only that last patch in the series that then changes
> > >> that meaning of "vm_ops". And I don't really _mind_ that last step
> > >> either, but since we don't know exactly what it was that it broke, and
> > >> we're past rc7, I don't think we really have any option but the revert
> > >> it.
> > >
> > > It took me a long time to grasp what was happening, that that last
> > > patch bfd40eaff5ab was fixing. Not quite explained in the commit.
> > >
> > > I think it was that by mistakenly passing the vma_is_anonymous() test,
> > > create_huge_pmd() gave the MAP_PRIVATE kcov mapping a THP (instead of
> > > COWing pages from kcov); which the truncate then had to split, and in
> > > going to do so, again hit the mistaken vma_is_anonymous() test, BUG.
> > >
> > >>
> > >> And if we revert it, I think we need to just remove the
> > >> VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> > >> it is quite likely that the real bug is that overzealous BUG_ON(),
> > >> since I can't see any reason why anonymous mappings should be special
> > >> there.
> > >
> > > Yes, that probably has to go: but it's not clear what state it leaves
> > > us in, with an anon THP being split by a truncate, without the expected
> > > locking; I don't remember offhand, probably a subtler bug than that BUG,
> > > which you may or may not consider an improvement.
> > >
> > > I fear that Kirill has not missed inserting a vma_set_anonymous() from
> > > somewhere that it should be, but rather that zygote is working with some
> > > special mapping which used to satisfy vma_is_anonymous(), faults supplying
> > > backing pages, but now comes out as !vma_is_anonymous(), so do_fault()
> > > finds !dummy_vm_ops.fault hence SIGBUS.
> >
> > I've been only casually following this thread (mostly just glad Amit
> > caught it and I could avoid having to bisect the issue in my own
> > Android testing), but this bit starting to shake a few old cobwebs
> > loose in my brain.
> >
> > I'm wondering if Zygote is utilizing ashmem here, and we're somehow
> > traversing ashmem purged memory, or due to some setup issue the
> > initial traverse isn't being zero-filled as expected?
> >
> > ashmem ranges are created using: shmem_file_setup() and shmem_zero_setup()
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n377
> >
> >
> > If we purge pages, it punches it out with:
> > vfs_fallocate(range->asma->file,
> > FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> > start, end - start);
> > here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n447
> >
> > But in ashmem_pin(), we don't do anything other then returning if we
> > purged any page in the range.
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/staging/android/ashmem.c#n577
> >
> > And I believe the future assumption is the if we traverse those pages
> > they will be zero filled (if purged or even during the initial
> > traversal after mmap)
> >
> > Its been a long time, and I've not looked at the code in question but
> > it sounds from Hugh's comments above that we might instead get a
> > SIGBUS here.
> >
> > Looking more at the problematic patch..
> > Amit: Does adding something like (whitespace damaged, apologies):
> >
> > index a1a0025..1af6915 100644
> > --- a/drivers/staging/android/ashmem.c
> > +++ b/drivers/staging/android/ashmem.c
> > @@ -402,7 +402,8 @@ static int ashmem_mmap(struct file *file, struct
> > vm_area_struct *vma)
> > fput(asma->file);
> > goto out;
> > }
> > - }
> > + } else
> > + vma_set_anonymous(vma);
> >
> > if (vma->vm_file)
> > fput(vma->vm_file);
> >
>
> This ashmem change ^^ worked too.
Okay. It makes sense.
But I'm not convinced that's a legitimate way to get an anonymous mapping.
I don't know how ashmem suppose to work. Looks like we get a shmem file
associated with the mapping, even if user asked for private mapping.
Shouldn't in this case vm_ops point to shmem_vm_ops?
Note, we have only one other case when MAP_PRIVATE on a file gets you an
anonymous mapping -- /dev/zero.
--
Kirill A. Shutemov
On Tue, Jul 31, 2018 at 09:29:27AM +0300, Kirill A. Shutemov wrote:
> On Mon, Jul 30, 2018 at 06:01:26PM -0700, Linus Torvalds wrote:
> > On Mon, Jul 30, 2018 at 2:53 PM Hugh Dickins <[email protected]> wrote:
> > >
> > > I have no problem with reverting -rc7's vma_is_anonymous() series.
> >
> > I don't think we need to revert the whole series: I think the rest are
> > all fairly obvious cleanups, and shouldn't really have any semantic
> > changes.
> >
> > It's literally only that last patch in the series that then changes
> > that meaning of "vm_ops". And I don't really _mind_ that last step
> > either, but since we don't know exactly what it was that it broke, and
> > we're past rc7, I don't think we really have any option but the revert
> > it.
> >
> > And if we revert it, I think we need to just remove the
> > VM_BUG_ON_VMA() that it was supposed to fix. Because I do think that
> > it is quite likely that the real bug is that overzealous BUG_ON(),
> > since I can't see any reason why anonymous mappings should be special
> > there.
> >
> > But I'm certainly also ok with re-visiting that commit later. I just
> > think that right _now_ the above is my preferred plan.
> >
> > Kirill?
>
> Considering the timing, I'm okay with reverting the last patch with
> dropping the VM_BUG_ON_VMA().
>
> But in the end I would like to see strong vma_is_anonymous().
>
> The VM_BUG_ON_VMA() is only triggerable by the test case because
> vma_is_anonymous() false-positive in fault path and we get anon-THP
> allocated in file-private mapping.
>
> I don't see immediately how this may trigger other crashes.
> But it definitely looks wrong.
>
> > > I'm all for deleting that VM_BUG_ON_VMA() in zap_pmd_range(), it was
> > > just a compromise with those who wanted to keep something there;
> > > I don't think we even need a WARN_ON_ONCE() now.
> >
> > So to me it looks like a historical check that simply doesn't
> > "normally" trigger, but there's no reason I can see why we should care
> > about the case it tests against.
>
> I'll think more on what could go wrong with __split_huge_pmd() called on
> anon-THP page without mmap_sem(). It's not yet clear cut to me.
I think not having mmap_sem taken at least on read when we call
__split_huge_pmd() opens possiblity of race with khugepaged:
khugepaged can collapse the page back to THP as soon as we drop ptl.
As result pmd_none_or_trans_huge_or_clear_bad() would return true and we
basically leave the THP behind, not zapped.
--
Kirill A. Shutemov
On Mon, Jul 30, 2018 at 11:40 PM Amit Pundir <[email protected]> wrote:
>
> This ashmem change ^^ worked too.
Ok, let's go for that one and hope it's the only one.
John, can I get a proper commit message and sign-off for that ashmem change?
Kirill - you mentioned that somebody reproduced a problem on x86-64
too. I didn't see that report. Was that some odd x86 Android setup
with Ashmem too, or is there something else pending?
Linus
On Tue, Jul 31, 2018 at 9:29 AM, Linus Torvalds
<[email protected]> wrote:
> On Mon, Jul 30, 2018 at 11:40 PM Amit Pundir <[email protected]> wrote:
>>
>> This ashmem change ^^ worked too.
>
> Ok, let's go for that one and hope it's the only one.
>
> John, can I get a proper commit message and sign-off for that ashmem change?
Will do. Just doing some local testing myself to make sure all is well.
> Kirill - you mentioned that somebody reproduced a problem on x86-64
> too. I didn't see that report. Was that some odd x86 Android setup
> with Ashmem too, or is there something else pending?
Krill mentioned "zygote crashing, but on x86-64" and zygote is Android
so I assume it is the same issue.
thanks
-john
On Tue, Jul 31, 2018 at 09:29:22AM -0700, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 11:40 PM Amit Pundir <[email protected]> wrote:
> >
> > This ashmem change ^^ worked too.
>
> Ok, let's go for that one and hope it's the only one.
>
> John, can I get a proper commit message and sign-off for that ashmem change?
>
> Kirill - you mentioned that somebody reproduced a problem on x86-64
> too. I didn't see that report. Was that some odd x86 Android setup
> with Ashmem too, or is there something else pending?
I've got report from youling privately:
"mm: fix vma_is_anonymous() false-positives" cause my userspace boot failed,
our Androidx86 userspace can running on linux mainline kernel,
revert it boot succeed with 4.18rc7 kernel.
"mm: fix vma_is_anonymous() false-positives" cause these
07-30 11:04:19.556 1609 1609 F DEBUG : pid: 1304, tid: 1304, name: zygote
>>> zygote <<<
07-30 11:04:19.556 1609 1609 F DEBUG : signal 7 (SIGBUS), code 2
(BUS_ADRERR), fault addr 0x7494d008
07-30 11:04:19.556 1609 1609 F DEBUG : eax 00000000 ebx f337bb68 ecx
000001e0 edx 7494d008
07-30 11:04:19.556 1609 1609 F DEBUG : esi 7494d000 edi 00000000
07-30 11:04:19.556 1609 1609 F DEBUG : xcs 00000023 xds 0000002b xes
0000002b xfs 00000003 xss 0000002b
07-30 11:04:19.556 1609 1609 F DEBUG : eip f40f5c76 ebp ffa8d288 esp
ffa8d238 flags 00010202
07-30 11:04:19.581 1609 1609 F DEBUG :
-------------------------------------------------------------------------
The report also had screenshot attached about system info. It's a Baytrail
tablet with LinageOS, so I believe it's the same issue.
But it's not the only issue unfortunately. Tony reported issue with
booting ia64 with the patch. I have no clue why. I rechecked everything
ia64-specific and looks fine to me. :-/
--
Kirill A. Shutemov
Amit Pundir and Youling in parallel reported crashes with recent
mainline kernels running Android:
F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
F DEBUG : Build fingerprint:
'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key
F DEBUG : Revision: '0'
F DEBUG : ABI: 'arm'
F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<<
F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
... <snip> ...
F DEBUG : backtrace:
F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48)
F DEBUG : #01 pc 0010c513 /system/lib/libart.so
(create_mspace_with_base+82)
F DEBUG : #02 pc 0015c601 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int,
unsigned int)+40)
F DEBUG : #03 pc 0015c3ed /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
std::__1::basic_string<char, std::__
1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int,
unsigned int, unsigned int, unsigned int, bool)+36)
...
This was bisected back to commit bfd40eaff5ab ("mm: fix vma_is_anonymous()
false-positives").
create_mspace_with_base() in the trace above, utilizes ashmem, and
with ashmem, for shared mappings we use shmem_zero_setup(), which sets
the vma->vm_ops to &shmem_vm_ops. But for private ashmem mappings
nothing sets the vma->vm_ops.
Looking at the problematic patch, it seems to add a requirement that
one call vma_set_anonymous() on a vma, otherwise the dummy_vm_ops will
be used. Using the dummy_vm_ops seem to triggger SIGBUS when traversing
unmapped pages.
Thus, this patch adds a call to vma_set_anonymous() for ashmem private
mappings and seems to avoid the reported problem.
Cc: Amit Pundir <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: [email protected]
Cc: youling 257 <[email protected]>
Fixes: bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives")
Reported-by: Amit Pundir <[email protected]>
Reported-by: Youling 257 <[email protected]>
Signed-off-by: John Stultz <[email protected]>
---
Hopefully my explanation make sense here. Please let me know if it
needs corrections.
thanks
-john
---
drivers/staging/android/ashmem.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index a1a0025..d5d33e1 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -402,6 +402,8 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
fput(asma->file);
goto out;
}
+ } else {
+ vma_set_anonymous(vma);
}
if (vma->vm_file)
--
2.7.4
On Tue, Jul 31, 2018 at 08:03:28PM +0300, Kirill A. Shutemov wrote:
> But it's not the only issue unfortunately. Tony reported issue with
> booting ia64 with the patch. I have no clue why. I rechecked everything
> ia64-specific and looks fine to me. :-/
If I just revert bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives")
then ia64 boots again.
-Tony
On Tue, Jul 31, 2018 at 10:43 AM Luck, Tony <[email protected]> wrote:
>
> If I just revert bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives")
> then ia64 boots again.
Ok, so it's not just the ashmem thing.
I think I'll do an rc8 with the revert, just so that we'll have some
time to figure this out. It's only Tuesday, but I already have 90
commits since rc7, so this isn't the only issue we're having.
I _prefer_ just the regular cadence of releases, but when I have a
reason to delay, I'll delay.
Linus
Build fingerprint:
'Android-x86/android_x86/x86:8.1.0/OPM6.171019.030.B1/cwhuang0618:userdebug/test-keys'
Revision: '0'
ABI: 'x86'
pid: 2899, tid: 2899, name: zygote >>> zygote <<<
signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
eax 00000000 ebx f0274a40 ecx 000001e0 edx 0ec00008
esi 00000000 edi 0ec00000
xcs 00000023 xds 0000002b xes 0000002b xfs 00000003 xss 0000002b
eip f24a4996 ebp ffc4eaa8 esp ffc4ea68 flags 00010202
backtrace:
#00 pc 0001a996 /system/lib/libc.so (memset+150)
#01 pc 0022b2be /system/lib/libart.so (create_mspace_with_base+222)
#02 pc 002b2a05 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int, unsigned
int)+69)
#03 pc 002b2637 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char>> const&, unsigned int, unsigned int, unsigned
int, unsigned int, bool)+55)
#04 pc 0027f6af /system/lib/libart.so (art::gc::Heap::Heap(unsigned
int, unsigned int, unsigned int, unsigned int, double, double, unsigned
int, unsigned int, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>> const&,
art::InstructionSet, art::gc::CollectorType, art::gc::CollectorType,
art::gc::space::LargeObjectSpaceType, unsigned int, unsigned int, unsigned
int, bool, unsigned int, unsigned int, bool, bool, bool, bool, bool, bool,
bool, bool, bool, bool, bool, unsig #05 pc 0055a17a /system/lib/libart.so
(_ZN3art7Runtime4InitEONS_18RuntimeArgumentMapE+11434)
#06 pc 0055e928 /system/lib/libart.so
(art::Runtime::Create(std::__1::vector<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>,
std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>>>
const&, bool)+184)
2018-08-01 0:29 GMT+08:00 Linus Torvalds <[email protected]>:
> On Mon, Jul 30, 2018 at 11:40 PM Amit Pundir <[email protected]>
> wrote:
> >
> > This ashmem change ^^ worked too.
>
> Ok, let's go for that one and hope it's the only one.
>
> John, can I get a proper commit message and sign-off for that ashmem
> change?
>
> Kirill - you mentioned that somebody reproduced a problem on x86-64
> too. I didn't see that report. Was that some odd x86 Android setup
> with Ashmem too, or is there something else pending?
>
> Linus
>
my x86 report
isPrevious: true
Build:
Android-x86/android_x86/x86:8.1.0/OPM6.171019.030.B1/cwhuang0618:userdebug/test-keys
Hardware: unknown
Revision: 0
Bootloader: unknown
Radio: unknown
Kernel: Linux version 4.18.0-rc7-android-x86_64+ (root@localhost) (gcc
version 8.2.0 (Ubuntu 8.2.0-1ubuntu2)) #1 SMP PREEMPT Mon Jul 30 12:26:29
CST 2018
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint:
'Android-x86/android_x86/x86:8.1.0/OPM6.171019.030.B1/cwhuang0618:userdebug/test-keys'
Revision: '0'
ABI: 'x86'
pid: 2899, tid: 2899, name: zygote >>> zygote <<<
signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008
eax 00000000 ebx f0274a40 ecx 000001e0 edx 0ec00008
esi 00000000 edi 0ec00000
xcs 00000023 xds 0000002b xes 0000002b xfs 00000003 xss 0000002b
eip f24a4996 ebp ffc4eaa8 esp ffc4ea68 flags 00010202
backtrace:
#00 pc 0001a996 /system/lib/libc.so (memset+150)
#01 pc 0022b2be /system/lib/libart.so (create_mspace_with_base+222)
#02 pc 002b2a05 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int, unsigned
int)+69)
#03 pc 002b2637 /system/lib/libart.so
(art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*,
std::__1::basic_string<char, std::__1::char_traits<char>,
std::__1::allocator<char>> const&, unsigned int, unsigned int, unsigned
int, unsigned int, bool)+55)
#04 pc 0027f6af /system/lib/libart.so (art::gc::Heap::Heap(unsigned
int, unsigned int, unsigned int, unsigned int, double, double, unsigned
int, unsigned int, std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>> const&,
art::InstructionSet, art::gc::CollectorType, art::gc::CollectorType,
art::gc::space::LargeObjectSpaceType, unsigned int, unsigned int, unsigned
int, bool, unsigned int, unsigned int, bool, bool, bool, bool, bool, bool,
bool, bool, bool, bool, bool, unsig #05 pc 0055a17a /system/lib/libart.so
(_ZN3art7Runtime4InitEONS_18RuntimeArgumentMapE+11434)
#06 pc 0055e928 /system/lib/libart.so
(art::Runtime::Create(std::__1::vector<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>,
std::__1::allocator<std::__1::pair<std::__1::basic_string<char,
std::__1::char_traits<char>, std::__1::allocator<char>>, void const*>>>
const&, bool)+184)
#07 pc 003a1117 /system/lib/libart.so (JNI_CreateJavaVM+647)
#08 pc 000044ae /system/lib/libnativehelper.so (JNI_CreateJavaVM+46)
#09 pc 00078d1d /system/lib/libandroid_runtime.so
(android::AndroidRuntime::startVm(_JavaVM**, _JNIEnv**, bool)+7581)
#10 pc 000791fb /system/lib/libandroid_runtime.so
(android::AndroidRuntime::start(char const*,
android::Vector<android::String8> const&, bool)+395)
#11 pc 00003119 /system/bin/app_process32 (main+1689)
#12 pc 000b6a34 /system/lib/libc.so (__libc_init+100)
#13 pc 000029dd /system/bin/app_process32 (_start_main+80)
#14 pc 000029e8 /system/bin/app_process32 (_start+10)
#15 pc 00000004 <unknown>
#16 pc 00020b69 [stack:ffc31000]
2018-08-01 0:29 GMT+08:00 Linus Torvalds <[email protected]>:
> On Mon, Jul 30, 2018 at 11:40 PM Amit Pundir <[email protected]>
> wrote:
> >
> > This ashmem change ^^ worked too.
>
> Ok, let's go for that one and hope it's the only one.
>
> John, can I get a proper commit message and sign-off for that ashmem
> change?
>
> Kirill - you mentioned that somebody reproduced a problem on x86-64
> too. I didn't see that report. Was that some odd x86 Android setup
> with Ashmem too, or is there something else pending?
>
> Linus
>
On Tue, 31 Jul 2018, Kirill A. Shutemov wrote:
> On Tue, Jul 31, 2018 at 09:29:27AM +0300, Kirill A. Shutemov wrote:
> > On Mon, Jul 30, 2018 at 06:01:26PM -0700, Linus Torvalds wrote:
> > >
> > > So to me it looks like a historical check that simply doesn't
> > > "normally" trigger, but there's no reason I can see why we should care
> > > about the case it tests against.
> >
> > I'll think more on what could go wrong with __split_huge_pmd() called on
> > anon-THP page without mmap_sem(). It's not yet clear cut to me.
>
> I think not having mmap_sem taken at least on read when we call
> __split_huge_pmd() opens possiblity of race with khugepaged:
> khugepaged can collapse the page back to THP as soon as we drop ptl.
> As result pmd_none_or_trans_huge_or_clear_bad() would return true and we
> basically leave the THP behind, not zapped.
I think we don't care deeply about the POSIX truncate semantics on the
kind of "file" that has managed to get to this point: in the unlikely
event that a THP is immediately recreated there, never mind, so long as
we don't crash or leak memory or suchlike (the surplus THP would get
freed at exit).
I think we're altogether better off just deleting that VM_BUG_ON_VMA();
but I do find it very very hard to arrive at a firm conclusion on the
absolute safety of splitting a pmd without mmap_sem there (though any
problem unlikely even if real, and more likely a figment of my paranoia).
I believe the VM_BUG_ON is a relic from the old days, when anon_vma_lock
played a big part in guarding the pmd+page split: remember how mmap_sem
is one of the ways you can guarantee that the anon_vma will not vanish
beneath you (page_get_anon_vma was added later than anon THP).
I'm a little more worried by the nearby zap_huge_pmd() (which could
never be covered by a suitable VM_BUG_ON): the way that frees a
previously deposited page table, and you have no guarantee of when
and where that page table was last used. Again I can't point to an
actual problem, just the recollection that it's been found subtly
safe in the past, but any change in the conditions might affect that.
And a little worried to see how split_huge_page_to_list() uses
anon_vma_lock on PageAnon versus i_mmap_lock on !PageAnon: which
makes complete sense in itself, but won't protect against a PageAnon
THP being concurrently split from the truncate_pagecache() direction,
where unmap_mapping_range() uses i_mmap_lock. (simple_setattr() the
default setattr: that's a bit of a worry too.)
I feel I'm moaning and crying at shadows, rather than providing any
useful suggestions or patches; but thought I ought to report back.
Hugh
On Tue, Jul 31, 2018 at 10:43 AM Luck, Tony <[email protected]> wrote:
>
> On Tue, Jul 31, 2018 at 08:03:28PM +0300, Kirill A. Shutemov wrote:
> > But it's not the only issue unfortunately. Tony reported issue with
> > booting ia64 with the patch. I have no clue why. I rechecked everything
> > ia64-specific and looks fine to me. :-/
>
> If I just revert bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives")
> then ia64 boots again.
Ok, I'd love to have more information about this, but I'm assuming
that Tony isn't running some odd ia64 version of Android, so there's
definitely something else than just the ashmem thing going on. Either
it's some odd ia64-specific special vma, or it's just something
triggered on an ia64 boot that nobody else noticed or cared about.
And I was just going to do the final revert and started this email to
say so, when I looked at the obvious candidate: the
ia64_init_addr_space() function. Trivially fixed.
But as I was doing that, I also noticed another problem with the vma
series: the vma_init() conversion of automatic variables is buggy.
Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack
and data segments") is really bad, because it never grew the memset()
that was discussed, and the patch that was applied was the original
one - so vma_init() only initializes a couple of fields. As a result,
doing things like this:
- struct vm_area_struct vma = { .vm_mm = mm };
+ struct vm_area_struct vma;
+ vma_init(&vma, mm);
is just completely wrong, because it actually initializes much less
than it used to, and leaves most of the vma filled with random stack
garbage. In particular, it now fills with garbage the fields that TLB
flushing really can care about: things like vm_flags that says "is
this perhaps an executable-only mapping that only needs to flush the
ITLB?"
I don't actually believe that we should do vma_init() on those
on-stack vma's anyway, since they aren't "real" vma's. They are
literally crafted just for TLB flushing - filling in the vm_mm (and
sometimes vm_flags) pointers so that the TLB flushing knows what to
do.
So using "vma_init()" on them is actively detrimental as things stand
right now. The reason I looked at them was that I was trying to see
who actually uses "vm_area_alloc()" and "vma_init()" right now that
would be affected by that commit bfd40eaff5ab ("mm: fix
vma_is_anonymous() false-positives") outside of actual
honest-to-goodness device file mmaps.
Anyway, the upshot of all this is that I think I know what the ia64
problem was, and John sent the patch for the ashmem case, and I'm
going to hold off reverting that vma_is_anonymous() false-positives
commit after all.
I'm still unhappy about the vma_init() ones, and I have not decided
how to go with those. Either the memset() in vma_init(), or just
reverting the (imho unnecessary) commit 2c4541e24c55. Kirill, Andrew,
comments?
Tony, can you please double-check my commit ebad825cdd4e ("ia64: mark
special ia64 memory areas anonymous") fixes things for you? I didn't
even compile it, but it really looks so obvious that I just committed
it directly. It's not pushed out yet (I'm doing the normal full build
test because of the ashmem commit first), but it should be out in
about 20 minutes when my testing has finished.
I'd like to get this sorted out asap, although at this point I still
think that I'll have to do an rc8 even though I feel like we may have
caught everything.
Linus
On Wed, 1 Aug 2018, Linus Torvalds wrote:
>
> Anyway, the upshot of all this is that I think I know what the ia64
> problem was, and John sent the patch for the ashmem case, and I'm
> going to hold off reverting that vma_is_anonymous() false-positives
> commit after all.
I'd better send deletion of zap_pmd_range()'s VM_BUG_ON_VMA(): below
(but I've no proprietorial interest, if you prefer to do your own).
John's patch is good, and originally I thought it was safe from that
VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
insists on VM_SHARED. But afterwards read John's earlier mail,
drawing attention to the vfs_fallocate() in there: I may be wrong,
and I don't know if Android has THP in the config anyway, but it looks
to me like an unmap_mapping_range() from ashmem's vfs_fallocate()
could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous().
(I'm not familiar with ashmem, and I certainly don't understand the
role of MAP_PRIVATE ashmem mappings - hole-punch's zap_pte_range()
should end up leaving any anon pages in place; but the presence of
the BUG is requiring us all to understand too much too quickly.)
[PATCH] mm: delete historical BUG from zap_pmd_range()
Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted
that mmap_sem must be held when splitting an "anonymous" vma there.
Whether that's still strictly true nowadays is not entirely clear,
but the danger of sometimes crashing on the BUG is now fairly clear.
Signed-off-by: Hugh Dickins <[email protected]>
---
mm/memory.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- 4.18-rc7/mm/memory.c 2018-06-16 18:48:22.041173422 -0700
+++ linux/mm/memory.c 2018-08-01 11:01:21.397286507 -0700
@@ -1417,11 +1417,9 @@ static inline unsigned long zap_pmd_rang
do {
next = pmd_addr_end(addr, end);
if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
- if (next - addr != HPAGE_PMD_SIZE) {
- VM_BUG_ON_VMA(vma_is_anonymous(vma) &&
- !rwsem_is_locked(&tlb->mm->mmap_sem), vma);
+ if (next - addr != HPAGE_PMD_SIZE)
__split_huge_pmd(vma, pmd, addr, false, NULL);
- } else if (zap_huge_pmd(tlb, vma, pmd, addr))
+ else if (zap_huge_pmd(tlb, vma, pmd, addr))
goto next;
/* fall through */
}
On Wed, Aug 01, 2018 at 10:15:05AM -0700, Linus Torvalds wrote:
> Tony, can you please double-check my commit ebad825cdd4e ("ia64: mark
> special ia64 memory areas anonymous") fixes things for you? I didn't
> even compile it, but it really looks so obvious that I just committed
> it directly. It's not pushed out yet (I'm doing the normal full build
> test because of the ashmem commit first), but it should be out in
> about 20 minutes when my testing has finished.
>
> I'd like to get this sorted out asap, although at this point I still
> think that I'll have to do an rc8 even though I feel like we may have
> caught everything.
I pulled and got HEAD = 44960f2a7b63e224b1091b3e1d6f60e0cdf4be0c which
includes ebad825cdd4e.
ia64 boots fine with no issues.
Thanks
-Tony
On Wed, Aug 1, 2018 at 10:15 AM Linus Torvalds
<[email protected]> wrote:
>
> I'm still unhappy about the vma_init() ones, and I have not decided
> how to go with those. Either the memset() in vma_init(), or just
> reverting the (imho unnecessary) commit 2c4541e24c55. Kirill, Andrew,
> comments?
Ugh. Adding a memset looks simple, but screws up some places that have
other initialization. It also requires adding a new include of
<linux/string.h>, or we'd need to uninline vma_init() and put it
somewhere else.
But just reverting commit 2c4541e24c55 ("mm: use vma_init() to
initialize VMAs on stack and data segments") entirely isn't good
either, because some of the cases aren't about the TLB flush
interface, and call down to "real" VM functions. The 'pseudo_vma' use
of remove_inode_hugepages() and hugetlbfs_fallocate() in particular is
odd, but using vma_init() looks good there. And those places had the
memset() already.
So I'm inclined to simply mark the TLB-related vma_init() cases
special, and use something like this:
#define TLB_FLUSH_VMA(mm,flags) { .vm_mm = (mm), .vm_flags = (flags) }
to make it very obvious when we're doing that vma initialization for
flush_tlb_range(). It's done as an initializer, exactly so that the
only valid syntax is to do somethin glike this:
struct vm_area_struct vma = TLB_FLUSH_VMA(mm, VM_EXEC);
That leaves vma_init() users to be just the actual real allocation
path, and a few very specific specual vmas (the hugetlbfs and
mempolicy pseudo-vma, and a couple of "gate" vmas).
Suggested patch attached. Comments?
Linus
On Wed, Aug 01, 2018 at 01:05:48PM -0700, Linus Torvalds wrote:
> On Wed, Aug 1, 2018 at 10:15 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > I'm still unhappy about the vma_init() ones, and I have not decided
> > how to go with those. Either the memset() in vma_init(), or just
> > reverting the (imho unnecessary) commit 2c4541e24c55. Kirill, Andrew,
> > comments?
>
> Ugh. Adding a memset looks simple, but screws up some places that have
> other initialization. It also requires adding a new include of
> <linux/string.h>, or we'd need to uninline vma_init() and put it
> somewhere else.
>
> But just reverting commit 2c4541e24c55 ("mm: use vma_init() to
> initialize VMAs on stack and data segments") entirely isn't good
> either, because some of the cases aren't about the TLB flush
> interface, and call down to "real" VM functions. The 'pseudo_vma' use
> of remove_inode_hugepages() and hugetlbfs_fallocate() in particular is
> odd, but using vma_init() looks good there. And those places had the
> memset() already.
>
> So I'm inclined to simply mark the TLB-related vma_init() cases
> special, and use something like this:
>
> #define TLB_FLUSH_VMA(mm,flags) { .vm_mm = (mm), .vm_flags = (flags) }
>
> to make it very obvious when we're doing that vma initialization for
> flush_tlb_range(). It's done as an initializer, exactly so that the
> only valid syntax is to do somethin glike this:
>
> struct vm_area_struct vma = TLB_FLUSH_VMA(mm, VM_EXEC);
>
> That leaves vma_init() users to be just the actual real allocation
> path, and a few very specific specual vmas (the hugetlbfs and
> mempolicy pseudo-vma, and a couple of "gate" vmas).
>
> Suggested patch attached. Comments?
Is there a reason why we pass vma to flush_tlb_range?
It's not obvious to me what information from VMA can be useful for an
implementation. I see that ecard.c initialize vm_flags too, but it seems
unused by flush_tlb_range.
Maybe it's cleaner to have generic helper flush_tlb_range_mm() or
something?
In longer term we can change the interface to take mm instead of vma.
--
Kirill A. Shutemov
On Wed, Aug 1, 2018 at 1:52 PM Kirill A. Shutemov <[email protected]> wrote:
>
> Is there a reason why we pass vma to flush_tlb_range?
Yes. It's even in that patch.
The fact is, real MM users *have* a vma, and passing it in to the TLB
flushing is the right thing to do. That allows architectures that care
(mainly powerpc, I think) to notice that "hey, this range only had
execute permissions, so I only need to flush the ITLB".
The people who use tlb_flush_range() any other way are doing an
arch-specific hack. It's not how tlb_flush_range() was defined, and
it's not how you can use it in general.
> It's not obvious to me what information from VMA can be useful for an
> implementation.
See the patch I sent, which had this as part of it:
- * XXX fix me: flush_tlb_range() should take an mm
pointer instead of a
- * vma pointer.
+ * flush_tlb_range() takes a vma instead of a mm pointer because
+ * some architectures want the vm_flags for ITLB/DTLB flush.
because I wanted to educate people about why the interface was what it
was, and the "fixme" was bogus shit.
> In longer term we can change the interface to take mm instead of vma.
FUCK NO!
Goddammit, read the code, or read the patch. The places ytou added
those broken vma_init() calls to were architecture-specific hacks.
Those architecture-specific hacks do not get to screw up the design
for everybody else.
Linus
On Wed, Aug 01, 2018 at 11:31:52AM -0700, Hugh Dickins wrote:
> On Wed, 1 Aug 2018, Linus Torvalds wrote:
> >
> > Anyway, the upshot of all this is that I think I know what the ia64
> > problem was, and John sent the patch for the ashmem case, and I'm
> > going to hold off reverting that vma_is_anonymous() false-positives
> > commit after all.
>
> I'd better send deletion of zap_pmd_range()'s VM_BUG_ON_VMA(): below
> (but I've no proprietorial interest, if you prefer to do your own).
Agreed.
Acked-by: Kirill A. Shutemov <[email protected]>
> John's patch is good, and originally I thought it was safe from that
> VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
> disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
> insists on VM_SHARED. But afterwards read John's earlier mail,
> drawing attention to the vfs_fallocate() in there: I may be wrong,
> and I don't know if Android has THP in the config anyway, but it looks
> to me like an unmap_mapping_range() from ashmem's vfs_fallocate()
> could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous().
>
> (I'm not familiar with ashmem, and I certainly don't understand the
> role of MAP_PRIVATE ashmem mappings - hole-punch's zap_pte_range()
> should end up leaving any anon pages in place; but the presence of
> the BUG is requiring us all to understand too much too quickly.)
Hugh, do you see any reason why ashmem shouldn't have vm_ops ==
shmem_vm_ops?
I don't understand ashmem, but I feel uncomfortable that we have this
sneaky way to create an anonymous VMA. It feels wrong to me.
--
Kirill A. Shutemov
On Wed, Aug 01, 2018 at 01:56:19PM -0700, Linus Torvalds wrote:
> On Wed, Aug 1, 2018 at 1:52 PM Kirill A. Shutemov <[email protected]> wrote:
> >
> > Is there a reason why we pass vma to flush_tlb_range?
>
> Yes. It's even in that patch.
>
> The fact is, real MM users *have* a vma, and passing it in to the TLB
> flushing is the right thing to do. That allows architectures that care
> (mainly powerpc, I think) to notice that "hey, this range only had
> execute permissions, so I only need to flush the ITLB".
>
> The people who use tlb_flush_range() any other way are doing an
> arch-specific hack. It's not how tlb_flush_range() was defined, and
> it's not how you can use it in general.
Okay, I see.
ARM, unicore32 and xtensa avoid iTLB flush for non-executable VMAs.
>
> > It's not obvious to me what information from VMA can be useful for an
> > implementation.
>
> See the patch I sent, which had this as part of it:
>
> - * XXX fix me: flush_tlb_range() should take an mm
> pointer instead of a
> - * vma pointer.
> + * flush_tlb_range() takes a vma instead of a mm pointer because
> + * some architectures want the vm_flags for ITLB/DTLB flush.
>
> because I wanted to educate people about why the interface was what it
> was, and the "fixme" was bogus shit.
I didn't noticied this. Sorry.
--
Kirill A. Shutemov
On Wed, 1 Aug 2018, Kirill A. Shutemov wrote:
> On Wed, Aug 01, 2018 at 11:31:52AM -0700, Hugh Dickins wrote:
> > On Wed, 1 Aug 2018, Linus Torvalds wrote:
> > >
> > > Anyway, the upshot of all this is that I think I know what the ia64
> > > problem was, and John sent the patch for the ashmem case, and I'm
> > > going to hold off reverting that vma_is_anonymous() false-positives
> > > commit after all.
> >
> > I'd better send deletion of zap_pmd_range()'s VM_BUG_ON_VMA(): below
> > (but I've no proprietorial interest, if you prefer to do your own).
>
> Agreed.
>
> Acked-by: Kirill A. Shutemov <[email protected]>
Thanks.
>
> > John's patch is good, and originally I thought it was safe from that
> > VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
> > disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
> > insists on VM_SHARED. But afterwards read John's earlier mail,
> > drawing attention to the vfs_fallocate() in there: I may be wrong,
> > and I don't know if Android has THP in the config anyway, but it looks
> > to me like an unmap_mapping_range() from ashmem's vfs_fallocate()
> > could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous().
> >
> > (I'm not familiar with ashmem, and I certainly don't understand the
> > role of MAP_PRIVATE ashmem mappings - hole-punch's zap_pte_range()
> > should end up leaving any anon pages in place; but the presence of
> > the BUG is requiring us all to understand too much too quickly.)
>
> Hugh, do you see any reason why ashmem shouldn't have vm_ops ==
> shmem_vm_ops?
I cannot immediately think of an absolute reason why not, but I'm
not giving it much thought; and that might turn it into a stranger
beast than it already is.
>
> I don't understand ashmem, but I feel uncomfortable that we have this
> sneaky way to create an anonymous VMA. It feels wrong to me.
I agree it's odd, but in this respect it's no odder than /dev/zero:
that has exactly the same pattern of shmem_zero_setup() for VM_SHARED,
else vma_set_anonymous(): which made me comfortable with John's patch,
restoring the way it worked before.
Admittedly, the subsequent vfs_fallocate() might generate surprises;
and the business of doing a shmem_file_setup() first, and then undoing
it with a shmem_zero_setup(), looks weird - from John's old XXX comment,
I think it was a quick hack to piece together some functionality they
needed in a hurry, which never got revisited (they wanted a name for
the area? maybe memfd would be good for that now).
But if what's in there is working now, I do not want to mess with it:
I'd be adding bugs faster than removing them.
Hugh
On Wed, 1 Aug 2018 at 22:45, Linus Torvalds
<[email protected]> wrote:
>
> I'd like to get this sorted out asap, although at this point I still
> think that I'll have to do an rc8 even though I feel like we may have
> caught everything.
No AOSP regressions in my limited smoke testing so far with
current HEAD: 6b4703768268 ("Merge branch 'fixes' of
git://git.armlinux.org.uk/~rmk/linux-arm"). Thanks.
Regards,
Amit Pundir
On Wed, Aug 1, 2018 at 2:55 PM, Hugh Dickins <[email protected]> wrote:
> On Wed, 1 Aug 2018, Kirill A. Shutemov wrote:
>> On Wed, Aug 01, 2018 at 11:31:52AM -0700, Hugh Dickins wrote:
>> > On Wed, 1 Aug 2018, Linus Torvalds wrote:
>> > >
>> > > Anyway, the upshot of all this is that I think I know what the ia64
>> > > problem was, and John sent the patch for the ashmem case, and I'm
>> > > going to hold off reverting that vma_is_anonymous() false-positives
>> > > commit after all.
>> >
>> > I'd better send deletion of zap_pmd_range()'s VM_BUG_ON_VMA(): below
>> > (but I've no proprietorial interest, if you prefer to do your own).
>>
>> Agreed.
>>
>> Acked-by: Kirill A. Shutemov <[email protected]>
>
> Thanks.
>
>>
>> > John's patch is good, and originally I thought it was safe from that
>> > VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is
>> > disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE)
>> > insists on VM_SHARED. But afterwards read John's earlier mail,
>> > drawing attention to the vfs_fallocate() in there: I may be wrong,
>> > and I don't know if Android has THP in the config anyway, but it looks
>> > to me like an unmap_mapping_range() from ashmem's vfs_fallocate()
>> > could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous().
>> >
>> > (I'm not familiar with ashmem, and I certainly don't understand the
>> > role of MAP_PRIVATE ashmem mappings - hole-punch's zap_pte_range()
>> > should end up leaving any anon pages in place; but the presence of
>> > the BUG is requiring us all to understand too much too quickly.)
>>
>> Hugh, do you see any reason why ashmem shouldn't have vm_ops ==
>> shmem_vm_ops?
>
> I cannot immediately think of an absolute reason why not, but I'm
> not giving it much thought; and that might turn it into a stranger
> beast than it already is.
>
>>
>> I don't understand ashmem, but I feel uncomfortable that we have this
>> sneaky way to create an anonymous VMA. It feels wrong to me.
>
> I agree it's odd, but in this respect it's no odder than /dev/zero:
> that has exactly the same pattern of shmem_zero_setup() for VM_SHARED,
> else vma_set_anonymous(): which made me comfortable with John's patch,
> restoring the way it worked before.
>
> Admittedly, the subsequent vfs_fallocate() might generate surprises;
> and the business of doing a shmem_file_setup() first, and then undoing
> it with a shmem_zero_setup(), looks weird - from John's old XXX comment,
Yes, it is weird. :)
> I think it was a quick hack to piece together some functionality they
> needed in a hurry, which never got revisited (they wanted a name for
> the area? maybe memfd would be good for that now).
So my XXX comment is to do with a change I made to ashmem in order for
it to go into staging, compared with the original Android
implementation. They still carry the patch to undo it here:
https://android.googlesource.com/kernel/common.git/+/ebfc8d6476a66dc91a1b30cbfc18e67d4401af09%5E%21/
But it has more to do w/ in the shared mapping case, providing a
cleaner way of setting the vma->vm_ops = &shmem_vm_ops without having
to use shmem_zero_setup(), and doesn't change the behavior in the
private mapping case.
This discussion has spurred Joel and I to chat a bit about reviving
the effort to "properly" upstream ashmem. We're still in discussion
but I'm thinking we might just try to add the pin/unpin functionality
to memfd. We'll see how the discussion evolves.
thanks
-john