2014-10-28 03:36:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 000/146] 3.17.2-stable review

This is the start of the stable review cycle for the 3.17.2 release.
There are 146 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Oct 30 03:32:38 UTC 2014.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.2-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.17.2-rc1

David S. Miller <[email protected]>
sparc64: Implement __get_user_pages_fast().

David S. Miller <[email protected]>
sparc64: Fix register corruption in top-most kernel stack frame during boot.

Dave Kleikamp <[email protected]>
sparc64: Increase size of boot string to 1024 bytes

David S. Miller <[email protected]>
sparc64: Kill unnecessary tables and increase MAX_BANKS.

bob picco <[email protected]>
sparc64: sparse irq

David S. Miller <[email protected]>
sparc64: Adjust vmalloc region size based upon available virtual address bits.

David S. Miller <[email protected]>
sparc64: Increase MAX_PHYS_ADDRESS_BITS to 53.

David S. Miller <[email protected]>
sparc64: Use kernel page tables for vmemmap.

David S. Miller <[email protected]>
sparc64: Fix physical memory management regressions with large max_phys_bits.

David S. Miller <[email protected]>
sparc64: Adjust KTSB assembler to support larger physical addresses.

David S. Miller <[email protected]>
sparc64: Define VA hole at run time, rather than at compile time.

David S. Miller <[email protected]>
sparc64: Switch to 4-level page tables.

bob picco <[email protected]>
sparc64: T5 PMU

Allen Pais <[email protected]>
sparc64: cpu hardware caps support for sparc M6 and M7

Allen Pais <[email protected]>
sparc64: support M6 and M7 for building CPU distribution map

Allen Pais <[email protected]>
sparc64: correctly recognise M6 and M7 cpu type

David S. Miller <[email protected]>
sparc64: Fix hibernation code refrence to PAGE_OFFSET.

David S. Miller <[email protected]>
sparc64: Do not define thread fpregs save area as zero-length array.

David S. Miller <[email protected]>
sparc64: Fix FPU register corruption with AES crypto offload.

David S. Miller <[email protected]>
sparc64: Fix lockdep warnings on reboot on Ultra-5

David S. Miller <[email protected]>
sparc64: Fix reversed start/end in flush_tlb_kernel_range()

Andreas Larsson <[email protected]>
sparc: Let memset return the address argument

Sowmini Varadhan <[email protected]>
sparc64: Move request_irq() from ldc_bind() to ldc_alloc()

bob picco <[email protected]>
sparc64: find_node adjustment

David S. Miller <[email protected]>
sparc64: Fix corrupted thread fault code.

bob picco <[email protected]>
sparc64: sun4v TLB error power off events

Daniel Hellstrom <[email protected]>
sparc32: dma_alloc_coherent must honour gfp flags

Dmitry Kasatkin <[email protected]>
ima: pass 'opened' flag to identify newly created files

Dmitry Kasatkin <[email protected]>
ima: provide flag to identify new empty files

Dmitry Kasatkin <[email protected]>
ima: fix fallback to use new_sync_read()

Gavin Shan <[email protected]>
powerpc/eeh: Clear frozen device state in time

Alexey Kardashevskiy <[email protected]>
powerpc/iommu/ddw: Fix endianness

Li Zhong <[email protected]>
powerpc: Only set numa node information for present cpus at boottime

Li Zhong <[email protected]>
powerpc: Fix warning reported by verify_cpu_node_mapping()

Catalin Marinas <[email protected]>
futex: Ensure get_futex_key_refs() always implies a barrier

Konstantin Khlebnikov <[email protected]>
mm/balloon_compaction: redesign ballooned pages management

Daniel Glöckner <[email protected]>
rtc-cmos: fix wakeup from S5 without CONFIG_PM_SLEEP

Sasha Levin <[email protected]>
kernel: add support for gcc 5

Weijie Yang <[email protected]>
mm/cma: fix cma bitmap aligned mask computing

Yann Droneaud <[email protected]>
fanotify: enable close-on-exec on events' fd when requested in fanotify_init()

Junxiao Bi <[email protected]>
mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

Jukka Rissanen <[email protected]>
Bluetooth: 6lowpan: Route packets that are not meant to peer via correct device

Jukka Rissanen <[email protected]>
Bluetooth: 6lowpan: Set the peer IPv6 address correctly

Jukka Rissanen <[email protected]>
Bluetooth: 6lowpan: Increase the connection timeout value

Johan Hedberg <[email protected]>
Bluetooth: Fix setting correct security level when initiating SMP

Champion Chen <[email protected]>
Bluetooth: Fix issue with USB suspend in btusb driver

Johan Hedberg <[email protected]>
Bluetooth: Fix incorrect LE CoC PDU length restriction based on HCI MTU

Loic Poulain <[email protected]>
Bluetooth: Fix HCI H5 corrupted ack value

Felix Fietkau <[email protected]>
Revert "ath9k_hw: reduce ANI firstep range for older chips"

Stanislaw Gruszka <[email protected]>
rt2800: correct BBP1_TX_POWER_CTRL mask

Ricardo Ribalda Delgado <[email protected]>
PCI: Generate uppercase hex for modalias interface class

Douglas Lehr <[email protected]>
PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size

Yinghai Lu <[email protected]>
PCI: Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources()

Thomas Petazzoni <[email protected]>
PCI: mvebu: Fix uninitialized variable in mvebu_get_tgt_attr()

Eric Sandeen <[email protected]>
xfs: fix agno increment in xfs_inumbers() loop

Dave Chinner <[email protected]>
xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly

Alex Williamson <[email protected]>
vfio-pci: Fix remove path locking

Jan Kara <[email protected]>
udf: Fix loading of special inodes

Chao Yu <[email protected]>
ecryptfs: avoid to access NULL pointer when write metadata in xattr

Fabio Estevam <[email protected]>
ARM: dts: imx28-evk: Let i2c0 run at 100kHz

[email protected] <[email protected]>
ARM: mvebu: Netgear RN102: Use Hardware BCH ECC

Arnaud Ebalard <[email protected]>
ARM: mvebu: Netgear RN2120: Use Hardware BCH ECC

Arnaud Ebalard <[email protected]>
ARM: mvebu: Netgear RN104: Use Hardware BCH ECC

Andrew Lunn <[email protected]>
ARM: Kirkwood: Fix DT based DSA.

Ludovic Desroches <[email protected]>
ARM: at91/PMC: don't forget to write PMC_PCDR register to disable clocks

Andreas Henriksson <[email protected]>
ARM: at91: fix at91sam9263ek DT mmc pinmuxing settings

David Dueck <[email protected]>
ARM: at91/dt: Fix typo regarding can0_clk

Andy Gross <[email protected]>
clk: qcom: Add IPQ8064 PLL required for USB

David Henningsson <[email protected]>
ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macro

Takashi Iwai <[email protected]>
ALSA: hda - Fix inverted LED gpio setup for Lenovo Ideapad

Anssi Hannula <[email protected]>
ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug

Vlad Catoi <[email protected]>
ALSA: usb-audio: Add support for Steinberg UR22 USB interface

Harsha Priya <[email protected]>
ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume

Takashi Iwai <[email protected]>
ALSA: emu10k1: Fix deadlock in synth voice lookup

Takashi Sakamoto <[email protected]>
ALSA: bebob: Fix failure to detect source of clock for Terratec Phase 88

Anatol Pomozov <[email protected]>
ALSA: pcm: use the same dma mmap codepath both for arm and arm64

Victor Kamensky <[email protected]>
arm64: compat: fix compat types affecting struct compat_elf_prpsinfo

Catalin Marinas <[email protected]>
arm64: Fix compilation error on UP builds

Andy Shevchenko <[email protected]>
spi: dw-mid: terminate ongoing transfers at exit

Oren Givon <[email protected]>
iwlwifi: Add missing PCI IDs for the 7260 series

Emmanuel Grumbach <[email protected]>
iwlwifi: mvm: disable BT Co-running by default

Trond Myklebust <[email protected]>
NFSv4.1/pnfs: replace broken pnfs_put_lseg_async

Trond Myklebust <[email protected]>
NFS: Fix a bogus warning in nfs_generic_pgio

Trond Myklebust <[email protected]>
NFS: Fix an uninitialised pointer Oops in the writeback error path

J. Bruce Fields <[email protected]>
nfsd4: reserve adequate space for LOCK op

Andy Adamson <[email protected]>
NFSv4.1: Fix an NFSv4.1 state renewal regression

Trond Myklebust <[email protected]>
NFSv4: fix open/lock state recovery error handling

Trond Myklebust <[email protected]>
NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails

Fabian Frederick <[email protected]>
nfs: fix duplicate proc entries

Frans Klaver <[email protected]>
tty: omap-serial: fix division by zero

Willy Tarreau <[email protected]>
lzo: check for length overrun in variable length encoding.

Willy Tarreau <[email protected]>
Revert "lzo: properly check for overruns"

Willy Tarreau <[email protected]>
Documentation: lzo: document part of the encoding

Olga Kornievskaia <[email protected]>
Fixing lease renewal

Geert Uytterhoeven <[email protected]>
m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()

Alexander Usyskin <[email protected]>
mei: bus: fix possible boundaries violation

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Cleanup hv_post_message()

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Fix a bug in vmbus_open()

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Cleanup vmbus_establish_gpadl()

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Cleanup vmbus_close_internal()

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Cleanup vmbus_teardown_gpadl()

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Cleanup vmbus_post_msg()

K. Y. Srinivasan <[email protected]>
Drivers: hv: util: Properly pack the data for file copy functionality

Will Deacon <[email protected]>
arm64: debug: don't re-enable debug exceptions on return from el1_dbg

Kees Cook <[email protected]>
firmware_class: make sure fw requests contain a name

Krzysztof Kozlowski <[email protected]>
dmaengine: pl330: Fix NULL pointer dereference on driver unbind

Krzysztof Kozlowski <[email protected]>
dmaengine: pl330: Fix NULL pointer dereference on probe failure

Xuelin Shi <[email protected]>
dmaengine: fix xor sources continuation

Joe Lawrence <[email protected]>
qla2xxx: Fix shost use-after-free on device removal

Arun Easi <[email protected]>
qla2xxx: Use correct offset to req-q-out for reserve calculation

Himanshu Madhani <[email protected]>
qla2xxx: fix kernel NULL pointer access

Steffen Trumtrar <[email protected]>
regulator: ltc3589: fix broken voltage transitions

Chris J Arges <[email protected]>
mptfusion: enable no_write_same for vmware scsi disks

Mike Christie <[email protected]>
be2iscsi: check ip buffer before copying

Xiubo Li <[email protected]>
regmap: fix possible ZERO_SIZE_PTR pointer dereferencing error.

Pankaj Dubey <[email protected]>
regmap: fix NULL pointer dereference in _regmap_write/read

Xiubo Li <[email protected]>
regmap: debugfs: fix possbile NULL pointer dereference

Borislav Petkov <[email protected]>
mpc85xx_edac: Make L2 interrupt shared too

Benjamin Tissoires <[email protected]>
HID: rmi: check sanity of the incoming report

Benjamin Tissoires <[email protected]>
HID: wacom: fix timeout on probe for some wacoms

Ping Cheng <[email protected]>
HID: wacom - remove report_id from wacom_get_report interface

Andy Shevchenko <[email protected]>
spi: dw-mid: check that DMA was inited before exit

Addy Ke <[email protected]>
spi/rockchip: fix bug that cause the failure to read data in DMA mode

Andy Shevchenko <[email protected]>
spi: dw-mid: respect 8 bit mode

Bryan O'Donoghue <[email protected]>
x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead

Andy Lutomirski <[email protected]>
x86,kvm,vmx: Preserve CR4 across VM entry

David Matlack <[email protected]>
kvm: don't take vcpu mutex for obviously invalid vcpu ioctls

Christian Borntraeger <[email protected]>
KVM: s390: unintended fallthrough for external call

Paolo Bonzini <[email protected]>
KVM: do not bias the generation number in kvm_current_mmio_generation

David Matlack <[email protected]>
kvm: fix potentially corrupt mmio cache

David Matlack <[email protected]>
kvm: x86: fix stale mmio cache bug

Josef Ahmad <[email protected]>
pci_ids: Add support for Intel Quark ILB

Andy Lutomirski <[email protected]>
fs: Add a missing permission check to do_umount

Chris Mason <[email protected]>
Revert "Btrfs: race free update of commit root for ro snapshots"

Sage Weil <[email protected]>
Btrfs: fix race in WAIT_SYNC ioctl

Qu Wenruo <[email protected]>
btrfs: Fix the wrong condition judgment about subset extent map

Josef Bacik <[email protected]>
Btrfs: fix build_backref_tree issue with multiple shared blocks

Josef Bacik <[email protected]>
Btrfs: cleanup error handling in build_backref_tree

Josef Bacik <[email protected]>
Btrfs: try not to ENOSPC on log replay

Josef Bacik <[email protected]>
Btrfs: don't do async reclaim during log replay

Qu Wenruo <[email protected]>
btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

Liu Bo <[email protected]>
Btrfs: fix up bounds checking in lseek

Filipe Manana <[email protected]>
Btrfs: add missing compression property remove in btrfs_ioctl_setflags

Qu Wenruo <[email protected]>
btrfs: Fix a deadlock in btrfs_dev_replace_finishing()

Mark Fasheh <[email protected]>
btrfs: don't go readonly on existing qgroup items

David Sterba <[email protected]>
btrfs: wake up transaction thread from SYNC_FS ioctl


-------------

Diffstat:

Documentation/lzo.txt | 164 ++++++++
Documentation/virtual/kvm/mmu.txt | 14 +
Makefile | 4 +-
arch/arm/boot/dts/Makefile | 4 +-
arch/arm/boot/dts/armada-370-netgear-rn102.dts | 4 +
arch/arm/boot/dts/armada-370-netgear-rn104.dts | 4 +
arch/arm/boot/dts/armada-xp-netgear-rn2120.dts | 4 +
arch/arm/boot/dts/at91sam9263.dtsi | 2 +
arch/arm/boot/dts/imx28-evk.dts | 1 -
arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts | 16 +-
arch/arm/boot/dts/kirkwood-rd88f6281-a.dts | 43 ++
arch/arm/boot/dts/kirkwood-rd88f6281-a0.dts | 26 --
arch/arm/boot/dts/kirkwood-rd88f6281-a1.dts | 31 --
arch/arm/boot/dts/kirkwood-rd88f6281-z0.dts | 35 ++
arch/arm/boot/dts/kirkwood-rd88f6281.dtsi | 27 +-
arch/arm/boot/dts/kirkwood.dtsi | 4 +-
arch/arm/boot/dts/sama5d3_can.dtsi | 2 +-
arch/arm/mach-at91/clock.c | 1 +
arch/arm64/include/asm/compat.h | 4 +-
arch/arm64/include/asm/irq_work.h | 11 +
arch/arm64/kernel/entry.S | 1 -
arch/m68k/mm/hwtest.c | 6 +
arch/powerpc/kernel/eeh_pe.c | 21 +-
arch/powerpc/kernel/smp.c | 10 +-
arch/powerpc/mm/numa.c | 5 +-
arch/powerpc/platforms/pseries/iommu.c | 51 ++-
arch/s390/kvm/interrupt.c | 1 +
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/hypervisor.h | 11 +
arch/sparc/include/asm/irq_64.h | 7 +-
arch/sparc/include/asm/ldc.h | 5 +-
arch/sparc/include/asm/oplib_64.h | 3 +-
arch/sparc/include/asm/page_64.h | 33 +-
arch/sparc/include/asm/pgalloc_64.h | 28 +-
arch/sparc/include/asm/pgtable_64.h | 100 +++--
arch/sparc/include/asm/setup.h | 2 +
arch/sparc/include/asm/spitfire.h | 2 +
arch/sparc/include/asm/thread_info_64.h | 4 +-
arch/sparc/include/asm/tsb.h | 87 ++--
arch/sparc/include/asm/visasm.h | 8 +
arch/sparc/kernel/cpu.c | 12 +
arch/sparc/kernel/cpumap.c | 2 +
arch/sparc/kernel/ds.c | 4 +-
arch/sparc/kernel/dtlb_prot.S | 6 +-
arch/sparc/kernel/entry.h | 3 -
arch/sparc/kernel/head_64.S | 52 +--
arch/sparc/kernel/hvapi.c | 1 +
arch/sparc/kernel/hvcalls.S | 16 +
arch/sparc/kernel/hvtramp.S | 1 -
arch/sparc/kernel/ioport.c | 5 +-
arch/sparc/kernel/irq_64.c | 507 ++++++++++++++--------
arch/sparc/kernel/ktlb.S | 125 +-----
arch/sparc/kernel/ldc.c | 41 +-
arch/sparc/kernel/pcr.c | 47 ++-
arch/sparc/kernel/perf_event.c | 3 +-
arch/sparc/kernel/setup_64.c | 36 +-
arch/sparc/kernel/smp_64.c | 7 +
arch/sparc/kernel/sun4v_tlb_miss.S | 35 +-
arch/sparc/kernel/trampoline_64.S | 12 +-
arch/sparc/kernel/traps_64.c | 15 +-
arch/sparc/kernel/tsb.S | 6 +-
arch/sparc/kernel/viohs.c | 4 +-
arch/sparc/kernel/vmlinux.lds.S | 10 +-
arch/sparc/lib/NG4memcpy.S | 14 +-
arch/sparc/lib/memset.S | 18 +-
arch/sparc/mm/fault_64.c | 3 +
arch/sparc/mm/gup.c | 30 ++
arch/sparc/mm/init_64.c | 554 +++++++++++++------------
arch/sparc/mm/init_64.h | 18 -
arch/sparc/power/hibernate_asm.S | 4 +-
arch/sparc/prom/bootstr_64.c | 5 +-
arch/sparc/prom/cif.S | 5 +-
arch/sparc/prom/init_64.c | 6 +-
arch/sparc/prom/p1275.c | 9 +-
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kernel/cpu/intel.c | 15 +
arch/x86/kvm/mmu.c | 29 +-
arch/x86/kvm/vmx.c | 16 +-
arch/x86/kvm/x86.h | 20 +-
crypto/async_tx/async_xor.c | 3 +-
drivers/base/firmware_class.c | 3 +
drivers/base/regmap/regmap-debugfs.c | 8 +-
drivers/base/regmap/regmap.c | 7 +-
drivers/bluetooth/btusb.c | 9 +
drivers/bluetooth/hci_h5.c | 2 +-
drivers/clk/qcom/gcc-ipq806x.c | 31 +-
drivers/dma/pl330.c | 12 +-
drivers/edac/mpc85xx_edac.c | 2 +-
drivers/hid/hid-rmi.c | 44 +-
drivers/hid/wacom_sys.c | 26 +-
drivers/hv/channel.c | 49 ++-
drivers/hv/connection.c | 17 +-
drivers/hv/hv.c | 27 +-
drivers/hv/hyperv_vmbus.h | 4 +
drivers/message/fusion/mptspi.c | 5 +
drivers/misc/mei/bus.c | 2 +-
drivers/net/wireless/ath/ath9k/ar5008_phy.c | 4 +-
drivers/net/wireless/iwlwifi/mvm/constants.h | 2 +-
drivers/net/wireless/iwlwifi/pcie/drv.c | 4 +
drivers/net/wireless/rt2x00/rt2800.h | 2 +-
drivers/pci/host/pci-mvebu.c | 6 +-
drivers/pci/pci-sysfs.c | 2 +-
drivers/pci/quirks.c | 20 +
drivers/pci/setup-bus.c | 2 +-
drivers/regulator/ltc3589.c | 1 +
drivers/rtc/rtc-cmos.c | 5 +-
drivers/scsi/be2iscsi/be_mgmt.c | 13 +-
drivers/scsi/qla2xxx/qla_os.c | 6 +-
drivers/scsi/qla2xxx/qla_target.c | 5 +-
drivers/spi/spi-dw-mid.c | 10 +-
drivers/spi/spi-rockchip.c | 15 +-
drivers/tty/serial/omap-serial.c | 12 +-
drivers/vfio/pci/vfio_pci.c | 136 +++---
drivers/virtio/virtio_balloon.c | 15 +-
fs/btrfs/dev-replace.c | 3 +-
fs/btrfs/extent-tree.c | 8 +-
fs/btrfs/file.c | 25 +-
fs/btrfs/inode.c | 118 +++---
fs/btrfs/ioctl.c | 42 ++
fs/btrfs/qgroup.c | 10 +-
fs/btrfs/relocation.c | 93 +++--
fs/btrfs/transaction.c | 12 +-
fs/ecryptfs/inode.c | 2 +-
fs/namei.c | 2 +-
fs/namespace.c | 2 +
fs/nfs/client.c | 2 +-
fs/nfs/filelayout/filelayout.c | 2 +-
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4renewd.c | 12 +-
fs/nfs/nfs4state.c | 18 +-
fs/nfs/pagelist.c | 10 +-
fs/nfs/pnfs.c | 33 +-
fs/nfs/pnfs.h | 6 +-
fs/nfsd/nfs4xdr.c | 8 +
fs/nfsd/vfs.c | 2 +-
fs/notify/fanotify/fanotify_user.c | 2 +-
fs/udf/inode.c | 14 +-
fs/udf/super.c | 10 +-
fs/udf/udfdecl.h | 13 +-
fs/xfs/xfs_aops.c | 16 +-
fs/xfs/xfs_itable.c | 3 +-
include/linux/balloon_compaction.h | 97 ++---
include/linux/compiler-gcc5.h | 66 +++
include/linux/ima.h | 4 +-
include/linux/migrate.h | 11 +-
include/linux/mm.h | 19 +
include/linux/pci_ids.h | 1 +
include/linux/sched.h | 6 +-
include/uapi/linux/hyperv.h | 2 +-
kernel/futex.c | 2 +
lib/lzo/lzo1x_decompress_safe.c | 103 +++--
mm/balloon_compaction.c | 26 +-
mm/cma.c | 4 +-
mm/compaction.c | 2 +-
mm/migrate.c | 16 +-
net/bluetooth/6lowpan.c | 80 +++-
net/bluetooth/l2cap_core.c | 6 +-
net/bluetooth/smp.c | 5 +-
security/integrity/ima/ima.h | 4 +-
security/integrity/ima/ima_appraise.c | 9 +-
security/integrity/ima/ima_crypto.c | 8 +-
security/integrity/ima/ima_main.c | 28 +-
security/integrity/integrity.h | 1 +
sound/core/pcm_native.c | 2 +-
sound/firewire/bebob/bebob_terratec.c | 4 +-
sound/pci/emu10k1/emu10k1_callback.c | 6 +-
sound/pci/hda/hda_local.h | 4 +-
sound/pci/hda/patch_hdmi.c | 15 +-
sound/pci/hda/patch_realtek.c | 7 +-
sound/usb/quirks-table.h | 30 ++
virt/kvm/kvm_main.c | 34 +-
171 files changed, 2566 insertions(+), 1501 deletions(-)


2014-10-28 03:36:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 001/146] btrfs: wake up transaction thread from SYNC_FS ioctl

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Sterba <[email protected]>

commit 2fad4e83e12591eb3bd213875b9edc2d18e93383 upstream.

The transaction thread may want to do more work, namely it pokes the
cleaner ktread that will start processing uncleaned subvols.

This can be triggered by user via the 'btrfs fi sync' command, otherwise
there was a delay up to 30 seconds before the cleaner started to clean
old snapshots.

Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/ioctl.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5283,6 +5283,12 @@ long btrfs_ioctl(struct file *file, unsi
if (ret)
return ret;
ret = btrfs_sync_fs(file->f_dentry->d_sb, 1);
+ /*
+ * The transaction thread may want to do more work,
+ * namely it pokes the cleaner ktread that will start
+ * processing uncleaned subvols.
+ */
+ wake_up_process(root->fs_info->transaction_kthread);
return ret;
}
case BTRFS_IOC_START_SYNC:

2014-10-28 03:36:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 012/146] Btrfs: fix race in WAIT_SYNC ioctl

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sage Weil <[email protected]>

commit 42383020beb1cfb05f5d330cc311931bc4917a97 upstream.

We check whether transid is already committed via last_trans_committed and
then search through trans_list for pending transactions. If
last_trans_committed is updated by btrfs_commit_transaction after we check
it (there is no locking), we will fail to find the committed transaction
and return EINVAL to the caller. This has been observed occasionally by
ceph-osd (which uses this ioctl heavily).

Fix by rechecking whether the provided transid <= last_trans_committed
after the search fails, and if so return 0.

Signed-off-by: Sage Weil <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/transaction.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -609,7 +609,6 @@ int btrfs_wait_for_commit(struct btrfs_r
if (transid <= root->fs_info->last_trans_committed)
goto out;

- ret = -EINVAL;
/* find specified transaction */
spin_lock(&root->fs_info->trans_lock);
list_for_each_entry(t, &root->fs_info->trans_list, list) {
@@ -625,9 +624,16 @@ int btrfs_wait_for_commit(struct btrfs_r
}
}
spin_unlock(&root->fs_info->trans_lock);
- /* The specified transaction doesn't exist */
- if (!cur_trans)
+
+ /*
+ * The specified transaction doesn't exist, or we
+ * raced with btrfs_commit_transaction
+ */
+ if (!cur_trans) {
+ if (transid > root->fs_info->last_trans_committed)
+ ret = -EINVAL;
goto out;
+ }
} else {
/* find newest transaction that is committing | committed */
spin_lock(&root->fs_info->trans_lock);

2014-10-28 03:36:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 003/146] btrfs: Fix a deadlock in btrfs_dev_replace_finishing()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Qu Wenruo <[email protected]>

commit 12b894cb288d57292b01cf158177b6d5c89a6272 upstream.

btrfs-transacion:5657
[stack snip]
btrfs_bio_map()
btrfs_bio_counter_inc_blocked()
percpu_counter_inc(&fs_info->bio_counter) ###bio_counter > 0(A)
__btrfs_bio_map()
btrfs_dev_replace_lock()
mutex_lock(dev_replace->lock) ###wait mutex(B)

btrfs:32612
[stack snip]
btrfs_dev_replace_start()
btrfs_dev_replace_lock()
mutex_lock(dev_replace->lock) ###hold mutex(B)
btrfs_dev_replace_finishing()
btrfs_rm_dev_replace_blocked()
wait until percpu_counter_sum == 0 ###wait on bio_counter(A)

This bug can be triggered quite easily by the following test script:
http://pastebin.com/MQmb37Cy

This patch will fix the ABBA problem by calling
btrfs_dev_replace_unlock() before btrfs_rm_dev_replace_blocked().

The consistency of btrfs devices list and their superblocks is protected
by device_list_mutex, not btrfs_dev_replace_lock/unlock().
So it is safe the move btrfs_dev_replace_unlock() before
btrfs_rm_dev_replace_blocked().

Reported-by: Zhao Lei <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Cc: Stefan Behrens <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/dev-replace.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -567,6 +567,8 @@ static int btrfs_dev_replace_finishing(s
btrfs_kobj_rm_device(fs_info, src_device);
btrfs_kobj_add_device(fs_info, tgt_device);

+ btrfs_dev_replace_unlock(dev_replace);
+
btrfs_rm_dev_replace_blocked(fs_info);

btrfs_rm_dev_replace_srcdev(fs_info, src_device);
@@ -580,7 +582,6 @@ static int btrfs_dev_replace_finishing(s
* superblock is scratched out so that it is no longer marked to
* belong to this filesystem.
*/
- btrfs_dev_replace_unlock(dev_replace);
mutex_unlock(&root->fs_info->fs_devices->device_list_mutex);
mutex_unlock(&root->fs_info->chunk_mutex);


2014-10-28 03:36:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 008/146] Btrfs: try not to ENOSPC on log replay

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josef Bacik <[email protected]>

commit 1d52c78afbbf80b58299e076a159617d6b42fe3c upstream.

When doing log replay we may have to update inodes, which traditionally goes
through our delayed inode stuff. This will try to move space over from the
trans handle, but we don't reserve space in our trans handle on replay since we
don't know how much we will need, so instead we try to flush. But because we
have a trans handle open we won't flush anything, so if we are out of reserve
space we will simply return ENOSPC. Since we know that if an operation made it
into the log then we definitely had space before the box bought the farm then we
don't need to worry about doing this space reservation. Use the
fs_info->log_root_recovering flag to skip the delayed inode stuff and update the
item directly. Thanks,

Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/inode.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3662,7 +3662,8 @@ noinline int btrfs_update_inode(struct b
* without delay
*/
if (!btrfs_is_free_space_inode(inode)
- && root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID) {
+ && root->root_key.objectid != BTRFS_DATA_RELOC_TREE_OBJECTID
+ && !root->fs_info->log_root_recovering) {
btrfs_update_root_times(trans, root);

ret = btrfs_delayed_update_inode(trans, root, inode);

2014-10-28 03:36:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 006/146] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Qu Wenruo <[email protected]>

commit e6c4efd87ab04e5ead363f24e6ac35ed3506d401 upstream.

The following commit enhanced the merge_extent_mapping() to reduce
fragment in extent map tree, but it can't handle case which existing
lies before map_start:
51f39 btrfs: Use right extent length when inserting overlap extent map.

[BUG]
When existing extent map's start is before map_start,
the em->len will be minus, which will corrupt the extent map and fail to
insert the new extent map.
This will happen when someone get a large extent map, but when it is
going to insert it into extent map tree, some one has already commit
some write and split the huge extent into small parts.

[REPRODUCER]
It is very easy to tiger using filebench with randomrw personality.
It is about 100% to reproduce when using 8G preallocated file in 60s
randonrw test.

[FIX]
This patch can now handle any existing extent position.
Since it does not directly use existing->start, now it will find the
previous and next extent around map_start.
So the old existing->start < map_start bug will never happen again.

[ENHANCE]
This patch will insert the best fitted extent map into extent map tree,
other than the oldest [map_start, map_start + sectorsize) or the
relatively newer but not perfect [map_start, existing->start).

The patch will first search existing extent that does not intersects with
the desired map range [map_start, map_start + len).
The existing extent will be either before or behind map_start, and based
on the existing extent, we can find out the previous and next extent
around map_start.

So the best fitted extent would be [prev->end, next->start).
For prev or next is not found, em->start would be prev->end and em->end
wold be next->start.

With this patch, the fragment in extent map tree should be reduced much
more than the 51f39 commit and reduce an unneeded extent map tree search.

Reported-by: Tsutomu Itoh <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/inode.c | 79 +++++++++++++++++++++++++++++++++++++++----------------
1 file changed, 57 insertions(+), 22 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6191,21 +6191,60 @@ out_fail_inode:
goto out_fail;
}

+/* Find next extent map of a given extent map, caller needs to ensure locks */
+static struct extent_map *next_extent_map(struct extent_map *em)
+{
+ struct rb_node *next;
+
+ next = rb_next(&em->rb_node);
+ if (!next)
+ return NULL;
+ return container_of(next, struct extent_map, rb_node);
+}
+
+static struct extent_map *prev_extent_map(struct extent_map *em)
+{
+ struct rb_node *prev;
+
+ prev = rb_prev(&em->rb_node);
+ if (!prev)
+ return NULL;
+ return container_of(prev, struct extent_map, rb_node);
+}
+
/* helper for btfs_get_extent. Given an existing extent in the tree,
+ * the existing extent is the nearest extent to map_start,
* and an extent that you want to insert, deal with overlap and insert
- * the new extent into the tree.
+ * the best fitted new extent into the tree.
*/
static int merge_extent_mapping(struct extent_map_tree *em_tree,
struct extent_map *existing,
struct extent_map *em,
u64 map_start)
{
+ struct extent_map *prev;
+ struct extent_map *next;
+ u64 start;
+ u64 end;
u64 start_diff;

BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
- start_diff = map_start - em->start;
- em->start = map_start;
- em->len = existing->start - em->start;
+
+ if (existing->start > map_start) {
+ next = existing;
+ prev = prev_extent_map(next);
+ } else {
+ prev = existing;
+ next = next_extent_map(prev);
+ }
+
+ start = prev ? extent_map_end(prev) : em->start;
+ start = max_t(u64, start, em->start);
+ end = next ? next->start : extent_map_end(em);
+ end = min_t(u64, end, extent_map_end(em));
+ start_diff = start - em->start;
+ em->start = start;
+ em->len = end - start;
if (em->block_start < EXTENT_MAP_LAST_BYTE &&
!test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
em->block_start += start_diff;
@@ -6482,25 +6521,21 @@ insert:

ret = 0;

- existing = lookup_extent_mapping(em_tree, start, len);
- if (existing && (existing->start > start ||
- existing->start + existing->len <= start)) {
+ existing = search_extent_mapping(em_tree, start, len);
+ /*
+ * existing will always be non-NULL, since there must be
+ * extent causing the -EEXIST.
+ */
+ if (start >= extent_map_end(existing) ||
+ start + len <= existing->start) {
+ /*
+ * The existing extent map is the one nearest to
+ * the [start, start + len) range which overlaps
+ */
+ err = merge_extent_mapping(em_tree, existing,
+ em, start);
free_extent_map(existing);
- existing = NULL;
- }
- if (!existing) {
- existing = lookup_extent_mapping(em_tree, em->start,
- em->len);
- if (existing) {
- err = merge_extent_mapping(em_tree, existing,
- em, start);
- free_extent_map(existing);
- if (err) {
- free_extent_map(em);
- em = NULL;
- }
- } else {
- err = -EIO;
+ if (err) {
free_extent_map(em);
em = NULL;
}

2014-10-28 03:36:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 015/146] pci_ids: Add support for Intel Quark ILB

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josef Ahmad <[email protected]>

commit bb048713bba3ead39f6112910906d9fe3f88ede7 upstream.

This patch adds the PCI id for Intel Quark ILB.
It will be used for GPIO and Multifunction device driver.

Signed-off-by: Josef Ahmad <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Chang Rebecca Swee Fun <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/pci_ids.h | 1 +
1 file changed, 1 insertion(+)

--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2557,6 +2557,7 @@
#define PCI_DEVICE_ID_INTEL_MFD_EMMC0 0x0823
#define PCI_DEVICE_ID_INTEL_MFD_EMMC1 0x0824
#define PCI_DEVICE_ID_INTEL_MRST_SD2 0x084F
+#define PCI_DEVICE_ID_INTEL_QUARK_X1000_ILB 0x095E
#define PCI_DEVICE_ID_INTEL_I960 0x0960
#define PCI_DEVICE_ID_INTEL_I960RM 0x0962
#define PCI_DEVICE_ID_INTEL_CENTERTON_ILB 0x0c60

2014-10-28 03:36:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 026/146] HID: wacom - remove report_id from wacom_get_report interface

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ping Cheng <[email protected]>

commit c64d883476812783e0400d37028756151d103e5c upstream.

It is assigned in buf[0] anyway.

Signed-off-by: Ping Cheng <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/wacom_sys.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/hid/wacom_sys.c
+++ b/drivers/hid/wacom_sys.c
@@ -23,13 +23,13 @@
#define WAC_CMD_ICON_BT_XFER 0x26
#define WAC_CMD_RETRIES 10

-static int wacom_get_report(struct hid_device *hdev, u8 type, u8 id,
- void *buf, size_t size, unsigned int retries)
+static int wacom_get_report(struct hid_device *hdev, u8 type, u8 *buf,
+ size_t size, unsigned int retries)
{
int retval;

do {
- retval = hid_hw_raw_request(hdev, id, buf, size, type,
+ retval = hid_hw_raw_request(hdev, buf[0], buf, size, type,
HID_REQ_GET_REPORT);
} while ((retval == -ETIMEDOUT || retval == -EPIPE) && --retries);

@@ -255,7 +255,7 @@ static int wacom_set_device_mode(struct
length, 1);
if (error >= 0)
error = wacom_get_report(hdev, HID_FEATURE_REPORT,
- report_id, rep_data, length, 1);
+ rep_data, length, 1);
} while ((error < 0 || rep_data[1] != mode) && limit++ < WAC_MSG_RETRIES);

kfree(rep_data);

2014-10-28 03:36:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 024/146] spi/rockchip: fix bug that cause the failure to read data in DMA mode

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Addy Ke <[email protected]>

commit a24e70c0ac146f8bcae3cdb7f514950d5b32219e upstream.

In my test on RK3288-pinky board, if spi is enabled, it will begin to
read data from slave regardless of whether the DMA is ready. So we
need prepare DMA before spi is enable.

Signed-off-by: Addy Ke <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/spi/spi-rockchip.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

--- a/drivers/spi/spi-rockchip.c
+++ b/drivers/spi/spi-rockchip.c
@@ -415,7 +415,7 @@ static void rockchip_spi_dma_txcb(void *
spin_unlock_irqrestore(&rs->lock, flags);
}

-static int rockchip_spi_dma_transfer(struct rockchip_spi *rs)
+static void rockchip_spi_prepare_dma(struct rockchip_spi *rs)
{
unsigned long flags;
struct dma_slave_config rxconf, txconf;
@@ -474,8 +474,6 @@ static int rockchip_spi_dma_transfer(str
dmaengine_submit(txdesc);
dma_async_issue_pending(rs->dma_tx.ch);
}
-
- return 1;
}

static void rockchip_spi_config(struct rockchip_spi *rs)
@@ -557,16 +555,17 @@ static int rockchip_spi_transfer_one(
else if (rs->rx)
rs->tmode = CR0_XFM_RO;

- if (master->can_dma && master->can_dma(master, spi, xfer))
+ /* we need prepare dma before spi was enabled */
+ if (master->can_dma && master->can_dma(master, spi, xfer)) {
rs->use_dma = 1;
- else
+ rockchip_spi_prepare_dma(rs);
+ } else {
rs->use_dma = 0;
+ }

rockchip_spi_config(rs);

- if (rs->use_dma)
- ret = rockchip_spi_dma_transfer(rs);
- else
+ if (!rs->use_dma)
ret = rockchip_spi_pio_transfer(rs);

return ret;

2014-10-28 03:36:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 028/146] HID: rmi: check sanity of the incoming report

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Benjamin Tissoires <[email protected]>

commit 5b65c2a0296644dd3dbdd590d6f00174d18c96b3 upstream.

In the Dell XPS 13 9333, it appears that sometimes the bus get confused
and corrupts the incoming data. It fills the input report with the
sentinel value "ff". Synaptics told us that such behavior does not comes
from the touchpad itself, so we filter out such reports here.

Unfortunately, we can not simply discard the incoming data because they
may contain useful information. Most of the time, the misbehavior is
quite near the end of the report, so we can still use the valid part of
it.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1123584

Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Andrew Duggan <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/hid-rmi.c | 44 ++++++++++++++++++++++++++++++++++++++------
1 file changed, 38 insertions(+), 6 deletions(-)

--- a/drivers/hid/hid-rmi.c
+++ b/drivers/hid/hid-rmi.c
@@ -320,10 +320,7 @@ static int rmi_f11_input_event(struct hi
int offset;
int i;

- if (size < hdata->f11.report_size)
- return 0;
-
- if (!(irq & hdata->f11.irq_mask))
+ if (!(irq & hdata->f11.irq_mask) || size <= 0)
return 0;

offset = (hdata->max_fingers >> 2) + 1;
@@ -332,9 +329,19 @@ static int rmi_f11_input_event(struct hi
int fs_bit_position = (i & 0x3) << 1;
int finger_state = (data[fs_byte_position] >> fs_bit_position) &
0x03;
+ int position = offset + 5 * i;
+
+ if (position + 5 > size) {
+ /* partial report, go on with what we received */
+ printk_once(KERN_WARNING
+ "%s %s: Detected incomplete finger report. Finger reports may occasionally get dropped on this platform.\n",
+ dev_driver_string(&hdev->dev),
+ dev_name(&hdev->dev));
+ hid_dbg(hdev, "Incomplete finger report\n");
+ break;
+ }

- rmi_f11_process_touch(hdata, i, finger_state,
- &data[offset + 5 * i]);
+ rmi_f11_process_touch(hdata, i, finger_state, &data[position]);
}
input_mt_sync_frame(hdata->input);
input_sync(hdata->input);
@@ -352,6 +359,11 @@ static int rmi_f30_input_event(struct hi
if (!(irq & hdata->f30.irq_mask))
return 0;

+ if (size < (int)hdata->f30.report_size) {
+ hid_warn(hdev, "Click Button pressed, but the click data is missing\n");
+ return 0;
+ }
+
for (i = 0; i < hdata->gpio_led_count; i++) {
if (test_bit(i, &hdata->button_mask)) {
value = (data[i / 8] >> (i & 0x07)) & BIT(0);
@@ -412,9 +424,29 @@ static int rmi_read_data_event(struct hi
return 1;
}

+static int rmi_check_sanity(struct hid_device *hdev, u8 *data, int size)
+{
+ int valid_size = size;
+ /*
+ * On the Dell XPS 13 9333, the bus sometimes get confused and fills
+ * the report with a sentinel value "ff". Synaptics told us that such
+ * behavior does not comes from the touchpad itself, so we filter out
+ * such reports here.
+ */
+
+ while ((data[valid_size - 1] == 0xff) && valid_size > 0)
+ valid_size--;
+
+ return valid_size;
+}
+
static int rmi_raw_event(struct hid_device *hdev,
struct hid_report *report, u8 *data, int size)
{
+ size = rmi_check_sanity(hdev, data, size);
+ if (size < 2)
+ return 0;
+
switch (data[0]) {
case RMI_READ_DATA_REPORT_ID:
return rmi_read_data_event(hdev, data, size);

2014-10-28 03:36:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 029/146] mpc85xx_edac: Make L2 interrupt shared too

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

commit a18c3f16a907b8977ef65fc8dd71ed3f7b751748 upstream.

The other two interrupt handlers in this driver are shared, except this
one. When loading the driver, it fails like this.

So make the IRQ line shared.

Freescale(R) MPC85xx EDAC driver, (C) 2006 Montavista Software
mpc85xx_mc_err_probe: No ECC DIMMs discovered
EDAC DEVICE0: Giving out device to module MPC85xx_edac controller mpc85xx_l2_err: DEV mpc85xx_l2_err (INTERRUPT)
genirq: Flags mismatch irq 16. 00000000 ([EDAC] L2 err) vs. 00000080 ([EDAC] PCI err)
mpc85xx_l2_err_probe: Unable to request irq 16 for MPC85xx L2 err
remove_proc_entry: removing non-empty directory 'irq/16', leaking at least 'aerdrv'
------------[ cut here ]------------
WARNING: at fs/proc/generic.c:521
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc5-dirty #1
task: ee058000 ti: ee046000 task.ti: ee046000
NIP: c016c0c4 LR: c016c0c4 CTR: c037b51c
REGS: ee047c10 TRAP: 0700 Not tainted (3.17.0-rc5-dirty)
MSR: 00029000 <CE,EE,ME> CR: 22008022 XER: 20000000

GPR00: c016c0c4 ee047cc0 ee058000 00000053 00029000 00000000 c037c744 00000003
GPR08: c09aab28 c09aab24 c09aab28 00000156 20008028 00000000 c0002ac8 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000139 c0950394
GPR24: c09f0000 ee5585b0 ee047d08 c0a10000 ee047d08 ee15f808 00000002 ee03f660
NIP [c016c0c4] remove_proc_entry
LR [c016c0c4] remove_proc_entry
Call Trace:
remove_proc_entry (unreliable)
unregister_irq_proc
free_desc
irq_free_descs
mpc85xx_l2_err_probe
platform_drv_probe
really_probe
__driver_attach
bus_for_each_dev
bus_add_driver
driver_register
mpc85xx_mc_init
do_one_initcall
kernel_init_freeable
kernel_init
ret_from_kernel_thread
Instruction dump: ...

Reported-and-tested-by: <[email protected]>
Acked-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/edac/mpc85xx_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -633,7 +633,7 @@ static int mpc85xx_l2_err_probe(struct p
if (edac_op_state == EDAC_OPSTATE_INT) {
pdata->irq = irq_of_parse_and_map(op->dev.of_node, 0);
res = devm_request_irq(&op->dev, pdata->irq,
- mpc85xx_l2_isr, 0,
+ mpc85xx_l2_isr, IRQF_SHARED,
"[EDAC] L2 err", edac_dev);
if (res < 0) {
printk(KERN_ERR

2014-10-28 03:37:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 035/146] regulator: ltc3589: fix broken voltage transitions

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steffen Trumtrar <[email protected]>

commit c5bb725ac2d1a13e9e766bf9a16bac986ade17cd upstream.

VCCR is used as a trigger to start voltage transitions, so
we need to mark it volatile in order to make sure it gets
written to hardware every time we set a new voltage.

Fixes regulator voltage being stuck at the first voltage
set after driver load.

[lst: reworded commit message]
Signed-off-by: Steffen Trumtrar <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/regulator/ltc3589.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/regulator/ltc3589.c
+++ b/drivers/regulator/ltc3589.c
@@ -372,6 +372,7 @@ static bool ltc3589_volatile_reg(struct
switch (reg) {
case LTC3589_IRQSTAT:
case LTC3589_PGSTAT:
+ case LTC3589_VCCR:
return true;
}
return false;

2014-10-28 03:37:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 020/146] kvm: dont take vcpu mutex for obviously invalid vcpu ioctls

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Matlack <[email protected]>

commit 2ea75be3219571d0ec009ce20d9971e54af96e09 upstream.

vcpu ioctls can hang the calling thread if issued while a vcpu is running.
However, invalid ioctls can happen when userspace tries to probe the kind
of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case,
we know the ioctl is going to be rejected as invalid anyway and we can
fail before trying to take the vcpu mutex.

This patch does not change functionality, it just makes invalid ioctls
fail faster.

Signed-off-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
virt/kvm/kvm_main.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -52,6 +52,7 @@

#include <asm/processor.h>
#include <asm/io.h>
+#include <asm/ioctl.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>

@@ -1991,6 +1992,9 @@ static long kvm_vcpu_ioctl(struct file *
if (vcpu->kvm->mm != current->mm)
return -EIO;

+ if (unlikely(_IOC_TYPE(ioctl) != KVMIO))
+ return -EINVAL;
+
#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS)
/*
* Special cases: vcpu ioctls that are asynchronous to vcpu execution,

2014-10-28 03:37:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 023/146] spi: dw-mid: respect 8 bit mode

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Shevchenko <[email protected]>

commit b41583e7299046abdc578c33f25ed83ee95b9b31 upstream.

In case of 8 bit mode and DMA usage we end up with every second byte written as
0. We have to respect bits_per_word settings what this patch actually does.

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/spi/spi-dw-mid.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -136,7 +136,7 @@ static int mid_spi_dma_transfer(struct d
txconf.dst_addr = dws->dma_addr;
txconf.dst_maxburst = LNW_DMA_MSIZE_16;
txconf.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- txconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
+ txconf.dst_addr_width = dws->dma_width;
txconf.device_fc = false;

txchan->device->device_control(txchan, DMA_SLAVE_CONFIG,
@@ -159,7 +159,7 @@ static int mid_spi_dma_transfer(struct d
rxconf.src_addr = dws->dma_addr;
rxconf.src_maxburst = LNW_DMA_MSIZE_16;
rxconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- rxconf.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
+ rxconf.src_addr_width = dws->dma_width;
rxconf.device_fc = false;

rxchan->device->device_control(rxchan, DMA_SLAVE_CONFIG,

2014-10-28 03:37:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 048/146] Drivers: hv: vmbus: Cleanup vmbus_establish_gpadl()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit 72c6b71c245dac8f371167d97ef471b367d0b66b upstream.

Eliminate the call to BUG_ON() by waiting for the host to respond. We are
trying to reclaim the ownership of memory that was given to the host and so
we will have to wait until the host responds.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/channel.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -363,7 +363,6 @@ int vmbus_establish_gpadl(struct vmbus_c
u32 next_gpadl_handle;
unsigned long flags;
int ret = 0;
- int t;

next_gpadl_handle = atomic_read(&vmbus_connection.next_gpadl_handle);
atomic_inc(&vmbus_connection.next_gpadl_handle);
@@ -410,9 +409,7 @@ int vmbus_establish_gpadl(struct vmbus_c

}
}
- t = wait_for_completion_timeout(&msginfo->waitevent, 5*HZ);
- BUG_ON(t == 0);
-
+ wait_for_completion(&msginfo->waitevent);

/* At this point, we received the gpadl created msg */
*gpadl_handle = gpadlmsg->gpadl;

2014-10-28 03:37:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 051/146] mei: bus: fix possible boundaries violation

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexander Usyskin <[email protected]>

commit cfda2794b5afe7ce64ee9605c64bef0e56a48125 upstream.

function 'strncpy' will fill whole buffer 'id.name' of fixed size (32)
with string value and will not leave place for NULL-terminator.
Possible buffer boundaries violation in following string operations.
Replace strncpy with strlcpy.

Signed-off-by: Alexander Usyskin <[email protected]>
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/misc/mei/bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -70,7 +70,7 @@ static int mei_cl_device_probe(struct de

dev_dbg(dev, "Device probe\n");

- strncpy(id.name, dev_name(dev), sizeof(id.name));
+ strlcpy(id.name, dev_name(dev), sizeof(id.name));

return driver->probe(device, &id);
}

2014-10-28 03:37:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 054/146] Documentation: lzo: document part of the encoding

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <[email protected]>

commit d98a0526434d27e261f622cf9d2e0028b5ff1a00 upstream.

Add a complete description of the LZO format as processed by the
decompressor. I have not found a public specification of this format
hence this analysis, which will be used to better understand the code.

Cc: Willem Pinckaers <[email protected]>
Cc: "Don A. Bailey" <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/lzo.txt | 164 ++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 164 insertions(+)

--- /dev/null
+++ b/Documentation/lzo.txt
@@ -0,0 +1,164 @@
+
+LZO stream format as understood by Linux's LZO decompressor
+===========================================================
+
+Introduction
+
+ This is not a specification. No specification seems to be publicly available
+ for the LZO stream format. This document describes what input format the LZO
+ decompressor as implemented in the Linux kernel understands. The file subject
+ of this analysis is lib/lzo/lzo1x_decompress_safe.c. No analysis was made on
+ the compressor nor on any other implementations though it seems likely that
+ the format matches the standard one. The purpose of this document is to
+ better understand what the code does in order to propose more efficient fixes
+ for future bug reports.
+
+Description
+
+ The stream is composed of a series of instructions, operands, and data. The
+ instructions consist in a few bits representing an opcode, and bits forming
+ the operands for the instruction, whose size and position depend on the
+ opcode and on the number of literals copied by previous instruction. The
+ operands are used to indicate :
+
+ - a distance when copying data from the dictionary (past output buffer)
+ - a length (number of bytes to copy from dictionary)
+ - the number of literals to copy, which is retained in variable "state"
+ as a piece of information for next instructions.
+
+ Optionally depending on the opcode and operands, extra data may follow. These
+ extra data can be a complement for the operand (eg: a length or a distance
+ encoded on larger values), or a literal to be copied to the output buffer.
+
+ The first byte of the block follows a different encoding from other bytes, it
+ seems to be optimized for literal use only, since there is no dictionary yet
+ prior to that byte.
+
+ Lengths are always encoded on a variable size starting with a small number
+ of bits in the operand. If the number of bits isn't enough to represent the
+ length, up to 255 may be added in increments by consuming more bytes with a
+ rate of at most 255 per extra byte (thus the compression ratio cannot exceed
+ around 255:1). The variable length encoding using #bits is always the same :
+
+ length = byte & ((1 << #bits) - 1)
+ if (!length) {
+ length = ((1 << #bits) - 1)
+ length += 255*(number of zero bytes)
+ length += first-non-zero-byte
+ }
+ length += constant (generally 2 or 3)
+
+ For references to the dictionary, distances are relative to the output
+ pointer. Distances are encoded using very few bits belonging to certain
+ ranges, resulting in multiple copy instructions using different encodings.
+ Certain encodings involve one extra byte, others involve two extra bytes
+ forming a little-endian 16-bit quantity (marked LE16 below).
+
+ After any instruction except the large literal copy, 0, 1, 2 or 3 literals
+ are copied before starting the next instruction. The number of literals that
+ were copied may change the meaning and behaviour of the next instruction. In
+ practice, only one instruction needs to know whether 0, less than 4, or more
+ literals were copied. This is the information stored in the <state> variable
+ in this implementation. This number of immediate literals to be copied is
+ generally encoded in the last two bits of the instruction but may also be
+ taken from the last two bits of an extra operand (eg: distance).
+
+ End of stream is declared when a block copy of distance 0 is seen. Only one
+ instruction may encode this distance (0001HLLL), it takes one LE16 operand
+ for the distance, thus requiring 3 bytes.
+
+ IMPORTANT NOTE : in the code some length checks are missing because certain
+ instructions are called under the assumption that a certain number of bytes
+ follow because it has already been garanteed before parsing the instructions.
+ They just have to "refill" this credit if they consume extra bytes. This is
+ an implementation design choice independant on the algorithm or encoding.
+
+Byte sequences
+
+ First byte encoding :
+
+ 0..17 : follow regular instruction encoding, see below. It is worth
+ noting that codes 16 and 17 will represent a block copy from
+ the dictionary which is empty, and that they will always be
+ invalid at this place.
+
+ 18..21 : copy 0..3 literals
+ state = (byte - 17) = 0..3 [ copy <state> literals ]
+ skip byte
+
+ 22..255 : copy literal string
+ length = (byte - 17) = 4..238
+ state = 4 [ don't copy extra literals ]
+ skip byte
+
+ Instruction encoding :
+
+ 0 0 0 0 X X X X (0..15)
+ Depends on the number of literals copied by the last instruction.
+ If last instruction did not copy any literal (state == 0), this
+ encoding will be a copy of 4 or more literal, and must be interpreted
+ like this :
+
+ 0 0 0 0 L L L L (0..15) : copy long literal string
+ length = 3 + (L ?: 15 + (zero_bytes * 255) + non_zero_byte)
+ state = 4 (no extra literals are copied)
+
+ If last instruction used to copy between 1 to 3 literals (encoded in
+ the instruction's opcode or distance), the instruction is a copy of a
+ 2-byte block from the dictionary within a 1kB distance. It is worth
+ noting that this instruction provides little savings since it uses 2
+ bytes to encode a copy of 2 other bytes but it encodes the number of
+ following literals for free. It must be interpreted like this :
+
+ 0 0 0 0 D D S S (0..15) : copy 2 bytes from <= 1kB distance
+ length = 2
+ state = S (copy S literals after this block)
+ Always followed by exactly one byte : H H H H H H H H
+ distance = (H << 2) + D + 1
+
+ If last instruction used to copy 4 or more literals (as detected by
+ state == 4), the instruction becomes a copy of a 3-byte block from the
+ dictionary from a 2..3kB distance, and must be interpreted like this :
+
+ 0 0 0 0 D D S S (0..15) : copy 3 bytes from 2..3 kB distance
+ length = 3
+ state = S (copy S literals after this block)
+ Always followed by exactly one byte : H H H H H H H H
+ distance = (H << 2) + D + 2049
+
+ 0 0 0 1 H L L L (16..31)
+ Copy of a block within 16..48kB distance (preferably less than 10B)
+ length = 2 + (L ?: 7 + (zero_bytes * 255) + non_zero_byte)
+ Always followed by exactly one LE16 : D D D D D D D D : D D D D D D S S
+ distance = 16384 + (H << 14) + D
+ state = S (copy S literals after this block)
+ End of stream is reached if distance == 16384
+
+ 0 0 1 L L L L L (32..63)
+ Copy of small block within 16kB distance (preferably less than 34B)
+ length = 2 + (L ?: 31 + (zero_bytes * 255) + non_zero_byte)
+ Always followed by exactly one LE16 : D D D D D D D D : D D D D D D S S
+ distance = D + 1
+ state = S (copy S literals after this block)
+
+ 0 1 L D D D S S (64..127)
+ Copy 3-4 bytes from block within 2kB distance
+ state = S (copy S literals after this block)
+ length = 3 + L
+ Always followed by exactly one byte : H H H H H H H H
+ distance = (H << 3) + D + 1
+
+ 1 L L D D D S S (128..255)
+ Copy 5-8 bytes from block within 2kB distance
+ state = S (copy S literals after this block)
+ length = 5 + L
+ Always followed by exactly one byte : H H H H H H H H
+ distance = (H << 3) + D + 1
+
+Authors
+
+ This document was written by Willy Tarreau <[email protected]> on 2014/07/19 during an
+ analysis of the decompression code available in Linux 3.16-rc5. The code is
+ tricky, it is possible that this document contains mistakes or that a few
+ corner cases were overlooked. In any case, please report any doubt, fix, or
+ proposed updates to the author(s) so that the document can be updated.

2014-10-28 03:38:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 041/146] dmaengine: pl330: Fix NULL pointer dereference on driver unbind

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski <[email protected]>

commit 6e4a2a83f95826201bbd89f55522537ea52d1d67 upstream.

Fix a NULL pointer dereference after unbinding the driver, if channel
resources were not yet allocated (no call to
pl330_alloc_chan_resources()):
$ echo 12850000.mdma > /sys/bus/amba/drivers/dma-pl330/unbind
[ 13.606533] DMA pl330_control: removing pch: eeab6800, chan: eeab6814, thread: (null)
[ 13.614472] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
[ 13.622537] pgd = ee284000
[ 13.625228] [0000000c] *pgd=6e1e4831, *pte=00000000, *ppte=00000000
[ 13.631482] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 13.636859] Modules linked in:
[ 13.639903] CPU: 0 PID: 1 Comm: sh Not tainted 3.17.0-rc3-next-20140904-00004-g7020ffc33ca3-dirty #420
[ 13.649187] task: ee80a800 ti: ee888000 task.ti: ee888000
[ 13.654589] PC is at _stop+0x8/0x2c8
[ 13.658131] LR is at pl330_control+0x70/0x2e8
[ 13.662468] pc : [<c0206028>] lr : [<c020649c>] psr: 60000093
[ 13.662468] sp : ee889e58 ip : 00000001 fp : 000bab70
[ 13.673922] r10: eeab6814 r9 : ee16debc r8 : 00000000
[ 13.679131] r7 : eeab685c r6 : 60000013 r5 : ee16de10 r4 : eeab6800
[ 13.685641] r3 : 00000002 r2 : 00000000 r1 : 00010000 r0 : 00000000
[ 13.692153] Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 13.699357] Control: 10c5387d Table: 6e28404a DAC: 00000015
[ 13.705085] Process sh (pid: 1, stack limit = 0xee888240)
[ 13.710466] Stack: (0xee889e58 to 0xee88a000)
[ 13.714808] 9e40: 00000002 eeab6800
[ 13.722969] 9e60: ee16de10 eeab6800 ee16de10 60000013 eeab685c c020649c 00000000 c040280c
[ 13.731128] 9e80: ee889e80 ee889e80 ee16de18 ee16de10 eeab6880 eeab6814 00200200 eeab68a8
[ 13.739287] 9ea0: 00100100 c0208048 00000000 c0409fc4 eea80800 eea808f8 c0605c44 0000000e
[ 13.747446] 9ec0: 0000000e eeb3960c eeb39600 c0203c48 eea80800 c0605c44 c0605a8c c023f694
[ 13.755605] 9ee0: ee80a800 eea80834 eea80800 c023f704 ee80a800 eea80800 c0605c44 c023e8ec
[ 13.763764] 9f00: 0000000e ee149780 ee29e580 ee889f80 ee29e580 c023e19c 0000000e c01167e4
[ 13.771923] 9f20: c01167a0 00000000 00000000 c0115e88 00000000 00000000 ee0b1a00 0000000e
[ 13.780082] 9f40: b6f48000 ee889f80 0000000e ee888000 b6f48000 c00bfadc 00000000 00000003
[ 13.788241] 9f60: 00000000 00000000 00000000 ee0b1a00 ee0b1a00 0000000e b6f48000 c00bfdf4
[ 13.796401] 9f80: 00000000 00000000 ffffffff 0000000e b6f48000 b6edc5d0 00000004 c000e7a4
[ 13.804560] 9fa0: 00000000 c000e620 0000000e b6f48000 00000001 b6f48000 0000000e 00000000
[ 13.812719] 9fc0: 0000000e b6f48000 b6edc5d0 00000004 0000000e b6f4c8c0 000c3470 000bab70
[ 13.820879] 9fe0: 00000000 bed2aa50 b6e18bdc b6e6b52c 60000010 00000001 c0c0c0c0 c0c0c0c0
[ 13.829058] [<c0206028>] (_stop) from [<c020649c>] (pl330_control+0x70/0x2e8)
[ 13.836165] [<c020649c>] (pl330_control) from [<c0208048>] (pl330_remove+0xb0/0xdc)
[ 13.843800] [<c0208048>] (pl330_remove) from [<c0203c48>] (amba_remove+0x24/0xc0)
[ 13.851272] [<c0203c48>] (amba_remove) from [<c023f694>] (__device_release_driver+0x70/0xc4)
[ 13.859685] [<c023f694>] (__device_release_driver) from [<c023f704>] (device_release_driver+0x1c/0x28)
[ 13.868971] [<c023f704>] (device_release_driver) from [<c023e8ec>] (unbind_store+0x58/0x90)
[ 13.877303] [<c023e8ec>] (unbind_store) from [<c023e19c>] (drv_attr_store+0x20/0x2c)
[ 13.885036] [<c023e19c>] (drv_attr_store) from [<c01167e4>] (sysfs_kf_write+0x44/0x48)
[ 13.892928] [<c01167e4>] (sysfs_kf_write) from [<c0115e88>] (kernfs_fop_write+0xc0/0x17c)
[ 13.901090] [<c0115e88>] (kernfs_fop_write) from [<c00bfadc>] (vfs_write+0xa0/0x1a8)
[ 13.908812] [<c00bfadc>] (vfs_write) from [<c00bfdf4>] (SyS_write+0x40/0x8c)
[ 13.915850] [<c00bfdf4>] (SyS_write) from [<c000e620>] (ret_fast_syscall+0x0/0x30)
[ 13.923392] Code: e5813010 e12fff1e e92d40f0 e24dd00c (e590200c)
[ 13.929467] ---[ end trace 10064e15a5929cf8 ]---

Terminate the thread and free channel resource only if channel resources
were allocated (thread is not NULL).

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: b3040e40675e ("DMA: PL330: Add dma api driver")
Reviewed-by: Lars-Peter Clausen <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/dma/pl330.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2784,8 +2784,10 @@ static int pl330_remove(struct amba_devi
list_del(&pch->chan.device_node);

/* Flush the channel */
- pl330_control(&pch->chan, DMA_TERMINATE_ALL, 0);
- pl330_free_chan_resources(&pch->chan);
+ if (pch->thread) {
+ pl330_control(&pch->chan, DMA_TERMINATE_ALL, 0);
+ pl330_free_chan_resources(&pch->chan);
+ }
}

pl330_del(pl330);

2014-10-28 03:38:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 059/146] NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <[email protected]>

commit a4339b7b686b4acc8b6de2b07d7bacbe3ae44b83 upstream.

If a NFSv4.x server returns NFS4ERR_STALE_CLIENTID in response to a
CREATE_SESSION or SETCLIENTID_CONFIRM in order to tell us that it rebooted
a second time, then the client will currently take this to mean that it must
declare all locks to be stale, and hence ineligible for reboot recovery.

RFC3530 and RFC5661 both suggest that the client should instead rely on the
server to respond to inelegible open share, lock and delegation reclaim
requests with NFS4ERR_NO_GRACE in this situation.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/nfs4state.c | 1 -
1 file changed, 1 deletion(-)

--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1761,7 +1761,6 @@ static int nfs4_handle_reclaim_lease_err
break;
case -NFS4ERR_STALE_CLIENTID:
clear_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
- nfs4_state_clear_reclaim_reboot(clp);
nfs4_state_start_reclaim_reboot(clp);
break;
case -NFS4ERR_CLID_INUSE:

2014-10-28 03:38:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 044/146] Drivers: hv: util: Properly pack the data for file copy functionality

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit bc5a5b02331a3175a5fca20a4beba249e573b672 upstream.

Properly pack the data for file copy functionality. Patch based on
investigation done by Matej Muzila <[email protected]>

Signed-off-by: K. Y. Srinivasan <[email protected]>
Reported-by: <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/uapi/linux/hyperv.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/uapi/linux/hyperv.h
+++ b/include/uapi/linux/hyperv.h
@@ -137,7 +137,7 @@ struct hv_do_fcopy {
__u64 offset;
__u32 size;
__u8 data[DATA_FRAGMENT];
-};
+} __attribute__((packed));

/*
* An implementation of HyperV key value pair (KVP) functionality for Linux.

2014-10-28 03:38:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 046/146] Drivers: hv: vmbus: Cleanup vmbus_teardown_gpadl()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit 66be653083057358724d56d817e870e53fb81ca7 upstream.

Eliminate calls to BUG_ON() by properly handling errors. In cases where
rollback is possible, we will return the appropriate error to have the
calling code decide how to rollback state. In the case where we are
transferring ownership of the guest physical pages to the host,
we will wait for the host to respond.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/channel.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -435,7 +435,7 @@ int vmbus_teardown_gpadl(struct vmbus_ch
struct vmbus_channel_gpadl_teardown *msg;
struct vmbus_channel_msginfo *info;
unsigned long flags;
- int ret, t;
+ int ret;

info = kmalloc(sizeof(*info) +
sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL);
@@ -457,11 +457,12 @@ int vmbus_teardown_gpadl(struct vmbus_ch
ret = vmbus_post_msg(msg,
sizeof(struct vmbus_channel_gpadl_teardown));

- BUG_ON(ret != 0);
- t = wait_for_completion_timeout(&info->waitevent, 5*HZ);
- BUG_ON(t == 0);
+ if (ret)
+ goto post_msg_err;

- /* Received a torndown response */
+ wait_for_completion(&info->waitevent);
+
+post_msg_err:
spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
list_del(&info->msglistentry);
spin_unlock_irqrestore(&vmbus_connection.channelmsg_lock, flags);

2014-10-28 03:38:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 062/146] nfsd4: reserve adequate space for LOCK op

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "J. Bruce Fields" <[email protected]>

commit f7b43d0c992c3ec3e8d9285c3fb5e1e0eb0d031a upstream.

As of 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space", we permit the server to process a LOCK operation even if
there might not be space to return the conflicting lockowner, because
we've made returning the conflicting lockowner optional.

However, the rpc server still wants to know the most we might possibly
return, so we need to take into account the possible conflicting
lockowner in the svc_reserve_space() call here.

Symptoms were log messages like "RPC request reserved 88 but used 108".

Fixes: 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low on space"
Reported-by: Kinglong Mee <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfsd/nfs4xdr.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1670,6 +1670,14 @@ nfsd4_decode_compound(struct nfsd4_compo
readbytes += nfsd4_max_reply(argp->rqstp, op);
} else
max_reply += nfsd4_max_reply(argp->rqstp, op);
+ /*
+ * OP_LOCK may return a conflicting lock. (Special case
+ * because it will just skip encoding this if it runs
+ * out of xdr buffer space, and it is the only operation
+ * that behaves this way.)
+ */
+ if (op->opnum == OP_LOCK)
+ max_reply += NFS4_OPAQUE_LIMIT;

if (op->status) {
argp->opcnt = i+1;

2014-10-28 03:38:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 064/146] NFS: Fix a bogus warning in nfs_generic_pgio

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <[email protected]>

commit b8fb9c30f25e45dab5d2cd310ab6913b6861d00f upstream.

It is OK for pageused == pagecount in the loop, as long as we don't add
another entry to the *pages array. Move the test so that it only triggers
in that case.

Reported-by: Steve Dickson <[email protected]>
Fixes: bba5c1887a92 (nfs: disallow duplicate pages in pgio page vectors)
Cc: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/pagelist.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -744,12 +744,11 @@ int nfs_generic_pgio(struct nfs_pageio_d
nfs_list_remove_request(req);
nfs_list_add_request(req, &hdr->pages);

- if (WARN_ON_ONCE(pageused >= pagecount))
- return nfs_pgio_error(desc, hdr);
-
if (!last_page || last_page != req->wb_page) {
- *pages++ = last_page = req->wb_page;
pageused++;
+ if (pageused > pagecount)
+ break;
+ *pages++ = last_page = req->wb_page;
}
}
if (WARN_ON_ONCE(pageused != pagecount))

2014-10-28 03:38:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 067/146] iwlwifi: Add missing PCI IDs for the 7260 series

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oren Givon <[email protected]>

commit 4f08970f5284dce486f0e2290834aefb2a262189 upstream.

Add 4 missing PCI IDs for the 7260 series.

Signed-off-by: Oren Givon <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/iwlwifi/pcie/drv.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/net/wireless/iwlwifi/pcie/drv.c
+++ b/drivers/net/wireless/iwlwifi/pcie/drv.c
@@ -273,6 +273,8 @@ static const struct pci_device_id iwl_hw
{IWL_PCI_DEVICE(0x08B1, 0x4070, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0x4072, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0x4170, iwl7260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x08B1, 0x4C60, iwl7260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x08B1, 0x4C70, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0x4060, iwl7260_2n_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0x406A, iwl7260_2n_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0x4160, iwl7260_2n_cfg)},
@@ -316,6 +318,8 @@ static const struct pci_device_id iwl_hw
{IWL_PCI_DEVICE(0x08B1, 0xC770, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B1, 0xC760, iwl7260_2n_cfg)},
{IWL_PCI_DEVICE(0x08B2, 0xC270, iwl7260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x08B1, 0xCC70, iwl7260_2ac_cfg)},
+ {IWL_PCI_DEVICE(0x08B1, 0xCC60, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B2, 0xC272, iwl7260_2ac_cfg)},
{IWL_PCI_DEVICE(0x08B2, 0xC260, iwl7260_2n_cfg)},
{IWL_PCI_DEVICE(0x08B2, 0xC26A, iwl7260_n_cfg)},

2014-10-28 03:38:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 061/146] NFSv4.1: Fix an NFSv4.1 state renewal regression

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Adamson <[email protected]>

commit d1f456b0b9545f1606a54cd17c20775f159bd2ce upstream.

Commit 2f60ea6b8ced ("NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation") set the NFS4_RENEW_TIMEOUT flag in nfs4_renew_state, and does
not put an nfs41_proc_async_sequence call, the NFSv4.1 lease renewal heartbeat
call, on the wire to renew the NFSv4.1 state if the flag was not set.

The NFS4_RENEW_TIMEOUT flag is set when "now" is after the last renewal
(cl_last_renewal) plus the lease time divided by 3. This is arbitrary and
sometimes does the following:

In normal operation, the only way a future state renewal call is put on the
wire is via a call to nfs4_schedule_state_renewal, which schedules a
nfs4_renew_state workqueue task. nfs4_renew_state determines if the
NFS4_RENEW_TIMEOUT should be set, and the calls nfs41_proc_async_sequence,
which only gets sent if the NFS4_RENEW_TIMEOUT flag is set.
Then the nfs41_proc_async_sequence rpc_release function schedules
another state remewal via nfs4_schedule_state_renewal.

Without this change we can get into a state where an application stops
accessing the NFSv4.1 share, state renewal calls stop due to the
NFS4_RENEW_TIMEOUT flag _not_ being set. The only way to recover
from this situation is with a clientid re-establishment, once the application
resumes and the server has timed out the lease and so returns
NFS4ERR_BAD_SESSION on the subsequent SEQUENCE operation.

An example application:
open, lock, write a file.

sleep for 6 * lease (could be less)

ulock, close.

In the above example with NFSv4.1 delegations enabled, without this change,
there are no OP_SEQUENCE state renewal calls during the sleep, and the
clientid is recovered due to lease expiration on the close.

This issue does not occur with NFSv4.1 delegations disabled, nor with
NFSv4.0, with or without delegations enabled.

Signed-off-by: Andy Adamson <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 2f60ea6b8ced (NFSv4: The NFSv4.0 client must send RENEW calls...)
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/nfs4proc.c | 2 +-
fs/nfs/nfs4renewd.c | 12 ++++++++++--
2 files changed, 11 insertions(+), 3 deletions(-)

--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7353,7 +7353,7 @@ static int nfs41_proc_async_sequence(str
int ret = 0;

if ((renew_flags & NFS4_RENEW_TIMEOUT) == 0)
- return 0;
+ return -EAGAIN;
task = _nfs41_proc_sequence(clp, cred, false);
if (IS_ERR(task))
ret = PTR_ERR(task);
--- a/fs/nfs/nfs4renewd.c
+++ b/fs/nfs/nfs4renewd.c
@@ -88,10 +88,18 @@ nfs4_renew_state(struct work_struct *wor
}
nfs_expire_all_delegations(clp);
} else {
+ int ret;
+
/* Queue an asynchronous RENEW. */
- ops->sched_state_renewal(clp, cred, renew_flags);
+ ret = ops->sched_state_renewal(clp, cred, renew_flags);
put_rpccred(cred);
- goto out_exp;
+ switch (ret) {
+ default:
+ goto out_exp;
+ case -EAGAIN:
+ case -ENOMEM:
+ break;
+ }
}
} else {
dprintk("%s: failed to call renewd. Reason: lease not expired \n",

2014-10-28 03:38:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 066/146] iwlwifi: mvm: disable BT Co-running by default

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Emmanuel Grumbach <[email protected]>

commit 9b60bb6d86496af1adc753795de2c12c4499868a upstream.

The tables still contain dummy values.

Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/iwlwifi/mvm/constants.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/wireless/iwlwifi/mvm/constants.h
+++ b/drivers/net/wireless/iwlwifi/mvm/constants.h
@@ -82,7 +82,7 @@
#define IWL_MVM_BT_COEX_EN_RED_TXP_THRESH 62
#define IWL_MVM_BT_COEX_DIS_RED_TXP_THRESH 65
#define IWL_MVM_BT_COEX_SYNC2SCO 1
-#define IWL_MVM_BT_COEX_CORUNNING 1
+#define IWL_MVM_BT_COEX_CORUNNING 0
#define IWL_MVM_BT_COEX_MPLUT 1

#endif /* __MVM_CONSTANTS_H */

2014-10-28 03:39:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 073/146] ALSA: emu10k1: Fix deadlock in synth voice lookup

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <[email protected]>

commit 95926035b187cc9fee6fb61385b7da9c28123f74 upstream.

The emu10k1 voice allocator takes voice_lock spinlock. When there is
no empty stream available, it tries to release a voice used by synth,
and calls get_synth_voice. The callback function,
snd_emu10k1_synth_get_voice(), however, also takes the voice_lock,
thus it deadlocks.

The fix is simply removing the voice_lock holds in
snd_emu10k1_synth_get_voice(), as this is always called in the
spinlock context.

Reported-and-tested-by: Arthur Marsh <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/emu10k1/emu10k1_callback.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

--- a/sound/pci/emu10k1/emu10k1_callback.c
+++ b/sound/pci/emu10k1/emu10k1_callback.c
@@ -85,6 +85,8 @@ snd_emu10k1_ops_setup(struct snd_emux *e
* get more voice for pcm
*
* terminate most inactive voice and give it as a pcm voice.
+ *
+ * voice_lock is already held.
*/
int
snd_emu10k1_synth_get_voice(struct snd_emu10k1 *hw)
@@ -92,12 +94,10 @@ snd_emu10k1_synth_get_voice(struct snd_e
struct snd_emux *emu;
struct snd_emux_voice *vp;
struct best_voice best[V_END];
- unsigned long flags;
int i;

emu = hw->synth;

- spin_lock_irqsave(&emu->voice_lock, flags);
lookup_voices(emu, hw, best, 1); /* no OFF voices */
for (i = 0; i < V_END; i++) {
if (best[i].voice >= 0) {
@@ -113,11 +113,9 @@ snd_emu10k1_synth_get_voice(struct snd_e
vp->emu->num_voices--;
vp->ch = -1;
vp->state = SNDRV_EMUX_ST_OFF;
- spin_unlock_irqrestore(&emu->voice_lock, flags);
return ch;
}
}
- spin_unlock_irqrestore(&emu->voice_lock, flags);

/* not found */
return -ENOMEM;

2014-10-28 03:39:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 074/146] ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Harsha Priya <[email protected]>

commit b450b17c156e264bc44a198046d3ebaaef5a041d upstream.

This patch sets the headphones mode to default before suspending
which helps avoid the pop noise on headphones

Signed-off-by: Harsha Priya <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/hda/patch_realtek.c | 3 +++
1 file changed, 3 insertions(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -3125,6 +3125,9 @@ static void alc283_shutup(struct hda_cod

alc_write_coef_idx(codec, 0x43, 0x9004);

+ /*depop hp during suspend*/
+ alc_write_coef_idx(codec, 0x06, 0x2100);
+
snd_hda_codec_write(codec, hp_pin, 0,
AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);


2014-10-28 03:39:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 078/146] ALSA: hda - Add missing terminating entry to SND_HDA_PIN_QUIRK macro

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Henningsson <[email protected]>

commit fb54a645b2739fb196446ffbbbe3f3589d117b55 upstream.

Without this terminating entry, the pin matching would continue
across random memory until a zero or a non-matching entry was found.

The result being that in some cases, the pin quirk would not be
applied correctly.

Signed-off-by: David Henningsson <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/hda/hda_local.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/sound/pci/hda/hda_local.h
+++ b/sound/pci/hda/hda_local.h
@@ -425,7 +425,7 @@ struct snd_hda_pin_quirk {
.subvendor = _subvendor,\
.name = _name,\
.value = _value,\
- .pins = (const struct hda_pintbl[]) { _pins } \
+ .pins = (const struct hda_pintbl[]) { _pins, {0, 0}} \
}
#else

@@ -433,7 +433,7 @@ struct snd_hda_pin_quirk {
{ .codec = _codec,\
.subvendor = _subvendor,\
.value = _value,\
- .pins = (const struct hda_pintbl[]) { _pins } \
+ .pins = (const struct hda_pintbl[]) { _pins, {0, 0}} \
}

#endif

2014-10-28 03:39:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 079/146] clk: qcom: Add IPQ8064 PLL required for USB

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Gross <[email protected]>

commit dc1b3f657f25798b2dc9ed8928b80eb3183019a2 upstream.

This patch adds the PLL0 that is required for the USB clocks to
work properly.

Signed-off-by: Andy Gross <[email protected]>
Fixes: 24d8fba44af3 "clk: qcom: Add support for IPQ8064's global clock controller (GCC)"
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/qcom/gcc-ipq806x.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)

--- a/drivers/clk/qcom/gcc-ipq806x.c
+++ b/drivers/clk/qcom/gcc-ipq806x.c
@@ -32,6 +32,33 @@
#include "clk-branch.h"
#include "reset.h"

+static struct clk_pll pll0 = {
+ .l_reg = 0x30c4,
+ .m_reg = 0x30c8,
+ .n_reg = 0x30cc,
+ .config_reg = 0x30d4,
+ .mode_reg = 0x30c0,
+ .status_reg = 0x30d8,
+ .status_bit = 16,
+ .clkr.hw.init = &(struct clk_init_data){
+ .name = "pll0",
+ .parent_names = (const char *[]){ "pxo" },
+ .num_parents = 1,
+ .ops = &clk_pll_ops,
+ },
+};
+
+static struct clk_regmap pll0_vote = {
+ .enable_reg = 0x34c0,
+ .enable_mask = BIT(0),
+ .hw.init = &(struct clk_init_data){
+ .name = "pll0_vote",
+ .parent_names = (const char *[]){ "pll0" },
+ .num_parents = 1,
+ .ops = &clk_pll_vote_ops,
+ },
+};
+
static struct clk_pll pll3 = {
.l_reg = 0x3164,
.m_reg = 0x3168,
@@ -154,7 +181,7 @@ static const u8 gcc_pxo_pll8_pll0[] = {
static const char *gcc_pxo_pll8_pll0_map[] = {
"pxo",
"pll8_vote",
- "pll0",
+ "pll0_vote",
};

static struct freq_tbl clk_tbl_gsbi_uart[] = {
@@ -2133,6 +2160,8 @@ static struct clk_branch usb_fs1_h_clk =
};

static struct clk_regmap *gcc_ipq806x_clks[] = {
+ [PLL0] = &pll0.clkr,
+ [PLL0_VOTE] = &pll0_vote,
[PLL3] = &pll3.clkr,
[PLL8] = &pll8.clkr,
[PLL8_VOTE] = &pll8_vote,

2014-10-28 03:39:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 070/146] arm64: compat: fix compat types affecting struct compat_elf_prpsinfo

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Victor Kamensky <[email protected]>

commit 971a5b6fe634bb7b617d8c5f25b6a3ddbc600194 upstream.

The compat_elf_prpsinfo structure does not match the arch/arm struct
elf_pspsinfo definition. As result NT_PRPSINFO note in core file
created by arm64 kernel for aarch32 (compat) process has wrong size.
So gdb cannot display command that caused process crash.

Fix is to change size of __compat_uid_t, __compat_gid_t so it would
match size of similar fields in arch/arm case.

Signed-off-by: Victor Kamensky <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/include/asm/compat.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -37,8 +37,8 @@ typedef s32 compat_ssize_t;
typedef s32 compat_time_t;
typedef s32 compat_clock_t;
typedef s32 compat_pid_t;
-typedef u32 __compat_uid_t;
-typedef u32 __compat_gid_t;
+typedef u16 __compat_uid_t;
+typedef u16 __compat_gid_t;
typedef u16 __compat_uid16_t;
typedef u16 __compat_gid16_t;
typedef u32 __compat_uid32_t;

2014-10-28 03:39:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 093/146] PCI: mvebu: Fix uninitialized variable in mvebu_get_tgt_attr()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Petazzoni <[email protected]>

commit 56fab6e189441d714a2bfc8a64f3df9c0749dff7 upstream.

Geert Uytterhoeven reported a warning when building pci-mvebu:

drivers/pci/host/pci-mvebu.c: In function 'mvebu_get_tgt_attr':
drivers/pci/host/pci-mvebu.c:887:39: warning: 'rtype' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (slot == PCI_SLOT(devfn) && type == rtype) {
^

And indeed, the code of mvebu_get_tgt_attr() may lead to the usage of rtype
when being uninitialized, even though it would only happen if we had
entries other than I/O space and 32 bits memory space.

This commit fixes that by simply skipping the current DT range being
considered, if it doesn't match the resource type we're looking for.

Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pci/host/pci-mvebu.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/pci/host/pci-mvebu.c
+++ b/drivers/pci/host/pci-mvebu.c
@@ -873,7 +873,7 @@ static int mvebu_get_tgt_attr(struct dev
rangesz = pna + na + ns;
nranges = rlen / sizeof(__be32) / rangesz;

- for (i = 0; i < nranges; i++) {
+ for (i = 0; i < nranges; i++, range += rangesz) {
u32 flags = of_read_number(range, 1);
u32 slot = of_read_number(range + 1, 1);
u64 cpuaddr = of_read_number(range + na, pna);
@@ -883,14 +883,14 @@ static int mvebu_get_tgt_attr(struct dev
rtype = IORESOURCE_IO;
else if (DT_FLAGS_TO_TYPE(flags) == DT_TYPE_MEM32)
rtype = IORESOURCE_MEM;
+ else
+ continue;

if (slot == PCI_SLOT(devfn) && type == rtype) {
*tgt = DT_CPUADDR_TO_TARGET(cpuaddr);
*attr = DT_CPUADDR_TO_ATTR(cpuaddr);
return 0;
}
-
- range += rangesz;
}

return -ENOENT;

2014-10-28 03:39:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 094/146] PCI: Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yinghai Lu <[email protected]>

commit d61b0e87d2dfba3706dbbd6c7c6fd41c3d845685 upstream.

In 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to
64-bit resources"), we added IORESOURCE_MEM_64 to the mask in
pci_assign_unassigned_root_bus_resources(), but not to the mask in
pci_assign_unassigned_bridge_resources().

Add IORESOURCE_MEM_64 to the pci_assign_unassigned_bridge_resources() type
mask.

Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pci/setup-bus.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1652,7 +1652,7 @@ void pci_assign_unassigned_bridge_resour
struct pci_dev_resource *fail_res;
int retval;
unsigned long type_mask = IORESOURCE_IO | IORESOURCE_MEM |
- IORESOURCE_PREFETCH;
+ IORESOURCE_PREFETCH | IORESOURCE_MEM_64;

again:
__pci_bus_size_bridges(parent, &add_list);

2014-10-28 03:39:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 098/146] Revert "ath9k_hw: reduce ANI firstep range for older chips"

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Felix Fietkau <[email protected]>

commit 171cdab8c78bb169d9693d587e1d02d2dd5a0274 upstream.

This reverts commit 09efc56345be4146ab9fc87a55c837ed5d6ea1ab

I've received reports that this change is decreasing throughput in some
rare conditions on an AR9280 based device

Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/ath/ath9k/ar5008_phy.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/wireless/ath/ath9k/ar5008_phy.c
+++ b/drivers/net/wireless/ath/ath9k/ar5008_phy.c
@@ -1004,9 +1004,11 @@ static bool ar5008_hw_ani_control_new(st
case ATH9K_ANI_FIRSTEP_LEVEL:{
u32 level = param;

- value = level;
+ value = level * 2;
REG_RMW_FIELD(ah, AR_PHY_FIND_SIG,
AR_PHY_FIND_SIG_FIRSTEP, value);
+ REG_RMW_FIELD(ah, AR_PHY_FIND_SIG_LOW,
+ AR_PHY_FIND_SIG_FIRSTEP_LOW, value);

if (level != aniState->firstepLevel) {
ath_dbg(common, ANI,

2014-10-28 03:39:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 081/146] ARM: at91: fix at91sam9263ek DT mmc pinmuxing settings

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andreas Henriksson <[email protected]>

commit b65e0fb3d046cc65d0a3c45d43de351fb363271b upstream.

As discovered on a custom board similar to at91sam9263ek and basing
its devicetree on that one apparently the pin muxing doesn't get
set up properly. This was discovered since the custom boards u-boot
does funky stuff with the pin muxing and leaved it set to SPI
which made the MMC driver not work under Linux.
The fix is simply to define the given configuration as the default.
This probably worked by pure luck before, but it's better to
make the muxing explicitly set.

Signed-off-by: Andreas Henriksson <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Signed-off-by: Nicolas Ferre <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/at91sam9263.dtsi | 2 ++
1 file changed, 2 insertions(+)

--- a/arch/arm/boot/dts/at91sam9263.dtsi
+++ b/arch/arm/boot/dts/at91sam9263.dtsi
@@ -834,6 +834,7 @@
compatible = "atmel,hsmci";
reg = <0xfff80000 0x600>;
interrupts = <10 IRQ_TYPE_LEVEL_HIGH 0>;
+ pinctrl-names = "default";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&mci0_clk>;
@@ -845,6 +846,7 @@
compatible = "atmel,hsmci";
reg = <0xfff84000 0x600>;
interrupts = <11 IRQ_TYPE_LEVEL_HIGH 0>;
+ pinctrl-names = "default";
#address-cells = <1>;
#size-cells = <0>;
clocks = <&mci1_clk>;

2014-10-28 03:39:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 100/146] Bluetooth: Fix incorrect LE CoC PDU length restriction based on HCI MTU

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johan Hedberg <[email protected]>

commit 72c6fb915ff2d30ae14053edee4f0d30019bad76 upstream.

The l2cap_create_le_flowctl_pdu() function that l2cap_segment_le_sdu()
calls is perfectly capable of doing packet fragmentation if given bigger
PDUs than the HCI buffers allow. Forcing the PDU length based on the HCI
MTU (conn->mtu) would therefore needlessly strict operation on hardware
with limited LE buffers (e.g. both Intel and Broadcom seem to have this
set to just 27 bytes).

This patch removes the restriction and makes it possible to send PDUs of
the full length that the remote MPS value allows.

Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/bluetooth/l2cap_core.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2418,12 +2418,8 @@ static int l2cap_segment_le_sdu(struct l

BT_DBG("chan %p, msg %p, len %zu", chan, msg, len);

- pdu_len = chan->conn->mtu - L2CAP_HDR_SIZE;
-
- pdu_len = min_t(size_t, pdu_len, chan->remote_mps);
-
sdu_len = len;
- pdu_len -= L2CAP_SDULEN_SIZE;
+ pdu_len = chan->remote_mps - L2CAP_SDULEN_SIZE;

while (len > 0) {
if (len <= pdu_len)

2014-10-28 03:39:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 102/146] Bluetooth: Fix setting correct security level when initiating SMP

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johan Hedberg <[email protected]>

commit 5eb596f55cacc2389554a8d7572d90d5e9d4269d upstream.

We can only determine the final security level when both pairing request
and response have been exchanged. When initiating pairing the starting
target security level is set to MEDIUM unless explicitly specified to be
HIGH, so that we can still perform pairing even if the remote doesn't
have MITM capabilities. However, once we've received the pairing
response we should re-consult the remote and local IO capabilities and
upgrade the target security level if necessary.

Without this patch the resulting Long Term Key will occasionally be
reported to be unauthenticated when it in reality is an authenticated
one.

Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/bluetooth/smp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -442,8 +442,11 @@ static int tk_request(struct l2cap_conn
}

/* Not Just Works/Confirm results in MITM Authentication */
- if (method != JUST_CFM)
+ if (method != JUST_CFM) {
set_bit(SMP_FLAG_MITM_AUTH, &smp->flags);
+ if (hcon->pending_sec_level < BT_SECURITY_HIGH)
+ hcon->pending_sec_level = BT_SECURITY_HIGH;
+ }

/* If both devices have Keyoard-Display I/O, the master
* Confirms and the slave Enters the passkey.

2014-10-28 03:40:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 085/146] ARM: mvebu: Netgear RN2120: Use Hardware BCH ECC

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnaud Ebalard <[email protected]>

commit 500abb6ccb9e3f8d638a7f422443a8549245ef90 upstream.

The bootloader on the Netgear ReadyNAS RN2120 uses Hardware BCH
ECC (strength = 4), while the pxa3xx NAND driver by default uses
Hamming ECC (strength = 1).

This patch changes the ECC mode on these machines to match that
of the bootloader and of the stock firmware. That way, it is
now possible to update the kernel from userland (e.g. using
standard tools from mtd-utils package); u-boot will happily
load and boot it.

The issue was initially reported and fixed by Ben Pedell for
RN102. The RN2120 shares the same Hynix H27U1G8F2BTR NAND
flash and setup. This patch is based on Ben's fix for RN102.

Fixes: ad51eddd95ad ("ARM: mvebu: Enable NAND controller in ReadyNAS 2120 .dts file")
Signed-off-by: Arnaud Ebalard <[email protected]>
Link: https://lkml.kernel.org/r/61f6a1b7ad0adc57a0e201b9680bc2e5f214a317.1410035142.git.arno@natisbad.org
Signed-off-by: Jason Cooper <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/armada-xp-netgear-rn2120.dts | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm/boot/dts/armada-xp-netgear-rn2120.dts
+++ b/arch/arm/boot/dts/armada-xp-netgear-rn2120.dts
@@ -223,6 +223,10 @@
marvell,nand-enable-arbiter;
nand-on-flash-bbt;

+ /* Use Hardware BCH ECC */
+ nand-ecc-strength = <4>;
+ nand-ecc-step-size = <512>;
+
partition@0 {
label = "u-boot";
reg = <0x0000000 0x180000>; /* 1.5MB */

2014-10-28 03:40:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 088/146] ecryptfs: avoid to access NULL pointer when write metadata in xattr

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chao Yu <[email protected]>

commit 35425ea2492175fd39f6116481fe98b2b3ddd4ca upstream.

Christopher Head 2014-06-28 05:26:20 UTC described:
"I tried to reproduce this on 3.12.21. Instead, when I do "echo hello > foo"
in an ecryptfs mount with ecryptfs_xattr specified, I get a kernel crash:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
PGD d7840067 PUD b2c3c067 PMD 0
Oops: 0002 [#1] SMP
Modules linked in: nvidia(PO)
CPU: 3 PID: 3566 Comm: bash Tainted: P O 3.12.21-gentoo-r1 #2
Hardware name: ASUSTek Computer Inc. G60JX/G60JX, BIOS 206 03/15/2010
task: ffff8801948944c0 ti: ffff8800bad70000 task.ti: ffff8800bad70000
RIP: 0010:[<ffffffff8110eb39>] [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
RSP: 0018:ffff8800bad71c10 EFLAGS: 00010246
RAX: 00000000000181a4 RBX: ffff880198648480 RCX: 0000000000000000
RDX: 0000000000000004 RSI: ffff880172010450 RDI: 0000000000000000
RBP: ffff880198490e40 R08: 0000000000000000 R09: 0000000000000000
R10: ffff880172010450 R11: ffffea0002c51e80 R12: 0000000000002000
R13: 000000000000001a R14: 0000000000000000 R15: ffff880198490e40
FS: 00007ff224caa700(0000) GS:ffff88019fcc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000bb07f000 CR4: 00000000000007e0
Stack:
ffffffff811826e8 ffff8800a39d8000 0000000000000000 000000000000001a
ffff8800a01d0000 ffff8800a39d8000 ffffffff81185fd5 ffffffff81082c2c
00000001a39d8000 53d0abbc98490e40 0000000000000037 ffff8800a39d8220
Call Trace:
[<ffffffff811826e8>] ? ecryptfs_setxattr+0x40/0x52
[<ffffffff81185fd5>] ? ecryptfs_write_metadata+0x1b3/0x223
[<ffffffff81082c2c>] ? should_resched+0x5/0x23
[<ffffffff8118322b>] ? ecryptfs_initialize_file+0xaf/0xd4
[<ffffffff81183344>] ? ecryptfs_create+0xf4/0x142
[<ffffffff810f8c0d>] ? vfs_create+0x48/0x71
[<ffffffff810f9c86>] ? do_last.isra.68+0x559/0x952
[<ffffffff810f7ce7>] ? link_path_walk+0xbd/0x458
[<ffffffff810fa2a3>] ? path_openat+0x224/0x472
[<ffffffff810fa7bd>] ? do_filp_open+0x2b/0x6f
[<ffffffff81103606>] ? __alloc_fd+0xd6/0xe7
[<ffffffff810ee6ab>] ? do_sys_open+0x65/0xe9
[<ffffffff8157d022>] ? system_call_fastpath+0x16/0x1b
RIP [<ffffffff8110eb39>] fsstack_copy_attr_all+0x2/0x61
RSP <ffff8800bad71c10>
CR2: 0000000000000000
---[ end trace df9dba5f1ddb8565 ]---"

If we create a file when we mount with ecryptfs_xattr_metadata option, we will
encounter a crash in this path:
->ecryptfs_create
->ecryptfs_initialize_file
->ecryptfs_write_metadata
->ecryptfs_write_metadata_to_xattr
->ecryptfs_setxattr
->fsstack_copy_attr_all
It's because our dentry->d_inode used in fsstack_copy_attr_all is NULL, and it
will be initialized when ecryptfs_initialize_file finish.

So we should skip copying attr from lower inode when the value of ->d_inode is
invalid.

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Tyler Hicks <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ecryptfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -1039,7 +1039,7 @@ ecryptfs_setxattr(struct dentry *dentry,
}

rc = vfs_setxattr(lower_dentry, name, value, size, flags);
- if (!rc)
+ if (!rc && dentry->d_inode)
fsstack_copy_attr_all(dentry->d_inode, lower_dentry->d_inode);
out:
return rc;

2014-10-28 03:40:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 083/146] ARM: Kirkwood: Fix DT based DSA.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Lunn <[email protected]>

commit 4f5e01e96d424b54f5f0e89ee1ba9ccca03a3941 upstream.

During the conversion of boards to use DT to instantiate Distributed
Switch Architecture, nobody volunteered to test. As to be expected,
the conversion was flawed. Testers and access to hardware has now
become available, and this patch hopefully fixes the problems.

dsa,mii-bus must be a phandle to the top level mdio node, not the port
specific subnode of the mdio device.

dsa,ethernet must be a phandle to the port subnode within the ethernet
DT node, not the ethernet node.

Don't pinctrl hog the card detect gpio for mvsdio.

Rename the .dts files to make it clearer which file is for the Z0
stepping and which for the A0 or later stepping.

Signed-off-by: Andrew Lunn <[email protected]>
Cc: [email protected]
Tested-by: Eugene Sanivsky <[email protected]>
Fixes: e2eaa339af44: ("ARM: Kirkwood: convert rd88f6281-setup.c to DT.")
Fixes: e7c8f3808be8: ("ARM: kirkwood: Convert mv88f6281gtw_ge switch setup to DT")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jason Cooper <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/Makefile | 4 +-
arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts | 16 +++------
arch/arm/boot/dts/kirkwood-rd88f6281-a.dts | 43 +++++++++++++++++++++++++
arch/arm/boot/dts/kirkwood-rd88f6281-a0.dts | 26 ---------------
arch/arm/boot/dts/kirkwood-rd88f6281-a1.dts | 31 ------------------
arch/arm/boot/dts/kirkwood-rd88f6281-z0.dts | 35 ++++++++++++++++++++
arch/arm/boot/dts/kirkwood-rd88f6281.dtsi | 27 ++-------------
arch/arm/boot/dts/kirkwood.dtsi | 4 +-
8 files changed, 93 insertions(+), 93 deletions(-)

--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -144,8 +144,8 @@ dtb-$(CONFIG_MACH_KIRKWOOD) += kirkwood-
kirkwood-openrd-client.dtb \
kirkwood-openrd-ultimate.dtb \
kirkwood-rd88f6192.dtb \
- kirkwood-rd88f6281-a0.dtb \
- kirkwood-rd88f6281-a1.dtb \
+ kirkwood-rd88f6281-z0.dtb \
+ kirkwood-rd88f6281-a.dtb \
kirkwood-rs212.dtb \
kirkwood-rs409.dtb \
kirkwood-rs411.dtb \
--- a/arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts
+++ b/arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts
@@ -123,11 +123,11 @@

dsa@0 {
compatible = "marvell,dsa";
- #address-cells = <2>;
+ #address-cells = <1>;
#size-cells = <0>;

- dsa,ethernet = <&eth0>;
- dsa,mii-bus = <&ethphy0>;
+ dsa,ethernet = <&eth0port>;
+ dsa,mii-bus = <&mdio>;

switch@0 {
#address-cells = <1>;
@@ -169,17 +169,13 @@

&mdio {
status = "okay";
-
- ethphy0: ethernet-phy@ff {
- reg = <0xff>; /* No phy attached */
- speed = <1000>;
- duplex = <1>;
- };
};

&eth0 {
status = "okay";
+
ethernet0-port@0 {
- phy-handle = <&ethphy0>;
+ speed = <1000>;
+ duplex = <1>;
};
};
--- /dev/null
+++ b/arch/arm/boot/dts/kirkwood-rd88f6281-a.dts
@@ -0,0 +1,43 @@
+/*
+ * Marvell RD88F6181 A Board descrition
+ *
+ * Andrew Lunn <[email protected]>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ *
+ * This file contains the definitions for the board with the A0 or
+ * higher stepping of the SoC. The ethernet switch does not have a
+ * "wan" port.
+ */
+
+/dts-v1/;
+#include "kirkwood-rd88f6281.dtsi"
+
+/ {
+ model = "Marvell RD88f6281 Reference design, with A0 or higher SoC";
+ compatible = "marvell,rd88f6281-a", "marvell,rd88f6281","marvell,kirkwood-88f6281", "marvell,kirkwood";
+
+ dsa@0 {
+ switch@0 {
+ reg = <10 0>; /* MDIO address 10, switch 0 in tree */
+ };
+ };
+};
+
+&mdio {
+ status = "okay";
+
+ ethphy1: ethernet-phy@11 {
+ reg = <11>;
+ };
+};
+
+&eth1 {
+ status = "okay";
+
+ ethernet1-port@0 {
+ phy-handle = <&ethphy1>;
+ };
+};
--- a/arch/arm/boot/dts/kirkwood-rd88f6281-a0.dts
+++ /dev/null
@@ -1,26 +0,0 @@
-/*
- * Marvell RD88F6181 A0 Board descrition
- *
- * Andrew Lunn <[email protected]>
- *
- * This file is licensed under the terms of the GNU General Public
- * License version 2. This program is licensed "as is" without any
- * warranty of any kind, whether express or implied.
- *
- * This file contains the definitions for the board with the A0 variant of
- * the SoC. The ethernet switch does not have a "wan" port.
- */
-
-/dts-v1/;
-#include "kirkwood-rd88f6281.dtsi"
-
-/ {
- model = "Marvell RD88f6281 Reference design, with A0 SoC";
- compatible = "marvell,rd88f6281-a0", "marvell,rd88f6281","marvell,kirkwood-88f6281", "marvell,kirkwood";
-
- dsa@0 {
- switch@0 {
- reg = <10 0>; /* MDIO address 10, switch 0 in tree */
- };
- };
-};
\ No newline at end of file
--- a/arch/arm/boot/dts/kirkwood-rd88f6281-a1.dts
+++ /dev/null
@@ -1,31 +0,0 @@
-/*
- * Marvell RD88F6181 A1 Board descrition
- *
- * Andrew Lunn <[email protected]>
- *
- * This file is licensed under the terms of the GNU General Public
- * License version 2. This program is licensed "as is" without any
- * warranty of any kind, whether express or implied.
- *
- * This file contains the definitions for the board with the A1 variant of
- * the SoC. The ethernet switch has a "wan" port.
- */
-
-/dts-v1/;
-
-#include "kirkwood-rd88f6281.dtsi"
-
-/ {
- model = "Marvell RD88f6281 Reference design, with A1 SoC";
- compatible = "marvell,rd88f6281-a1", "marvell,rd88f6281","marvell,kirkwood-88f6281", "marvell,kirkwood";
-
- dsa@0 {
- switch@0 {
- reg = <0 0>; /* MDIO address 0, switch 0 in tree */
- port@4 {
- reg = <4>;
- label = "wan";
- };
- };
- };
-};
\ No newline at end of file
--- /dev/null
+++ b/arch/arm/boot/dts/kirkwood-rd88f6281-z0.dts
@@ -0,0 +1,35 @@
+/*
+ * Marvell RD88F6181 Z0 stepping descrition
+ *
+ * Andrew Lunn <[email protected]>
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ *
+ * This file contains the definitions for the board using the Z0
+ * stepping of the SoC. The ethernet switch has a "wan" port.
+*/
+
+/dts-v1/;
+
+#include "kirkwood-rd88f6281.dtsi"
+
+/ {
+ model = "Marvell RD88f6281 Reference design, with Z0 SoC";
+ compatible = "marvell,rd88f6281-z0", "marvell,rd88f6281","marvell,kirkwood-88f6281", "marvell,kirkwood";
+
+ dsa@0 {
+ switch@0 {
+ reg = <0 0>; /* MDIO address 0, switch 0 in tree */
+ port@4 {
+ reg = <4>;
+ label = "wan";
+ };
+ };
+ };
+};
+
+&eth1 {
+ status = "disabled";
+};
--- a/arch/arm/boot/dts/kirkwood-rd88f6281.dtsi
+++ b/arch/arm/boot/dts/kirkwood-rd88f6281.dtsi
@@ -37,7 +37,6 @@

ocp@f1000000 {
pinctrl: pin-controller@10000 {
- pinctrl-0 = <&pmx_sdio_cd>;
pinctrl-names = "default";

pmx_sdio_cd: pmx-sdio-cd {
@@ -69,8 +68,8 @@
#address-cells = <2>;
#size-cells = <0>;

- dsa,ethernet = <&eth0>;
- dsa,mii-bus = <&ethphy1>;
+ dsa,ethernet = <&eth0port>;
+ dsa,mii-bus = <&mdio>;

switch@0 {
#address-cells = <1>;
@@ -119,35 +118,19 @@
};

partition@300000 {
- label = "data";
+ label = "rootfs";
reg = <0x0300000 0x500000>;
};
};

&mdio {
status = "okay";
-
- ethphy0: ethernet-phy@0 {
- reg = <0>;
- };
-
- ethphy1: ethernet-phy@ff {
- reg = <0xff>; /* No PHY attached */
- speed = <1000>;
- duple = <1>;
- };
};

&eth0 {
status = "okay";
ethernet0-port@0 {
- phy-handle = <&ethphy0>;
- };
-};
-
-&eth1 {
- status = "okay";
- ethernet1-port@0 {
- phy-handle = <&ethphy1>;
+ speed = <1000>;
+ duplex = <1>;
};
};
--- a/arch/arm/boot/dts/kirkwood.dtsi
+++ b/arch/arm/boot/dts/kirkwood.dtsi
@@ -309,7 +309,7 @@
marvell,tx-checksum-limit = <1600>;
status = "disabled";

- ethernet0-port@0 {
+ eth0port: ethernet0-port@0 {
compatible = "marvell,kirkwood-eth-port";
reg = <0>;
interrupts = <11>;
@@ -342,7 +342,7 @@
pinctrl-names = "default";
status = "disabled";

- ethernet1-port@0 {
+ eth1port: ethernet1-port@0 {
compatible = "marvell,kirkwood-eth-port";
reg = <0>;
interrupts = <15>;

2014-10-28 03:40:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 114/146] powerpc: Only set numa node information for present cpus at boottime

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Li Zhong <[email protected]>

commit bc3c4327c92b9ceb9a6356ec64d1b2ab2dc851f9 upstream.

As Nish suggested, it makes more sense to init the numa node informatiion
for present cpus at boottime, which could also avoid WARN_ON(1) in
numa_setup_cpu().

With this change, we also need to change the smp_prepare_cpus() to set up
numa information only on present cpus.

For those possible, but not present cpus, their numa information
will be set up after they are started, as the original code did before commit
2fabf084b6ad.

Cc: Nishanth Aravamudan <[email protected]>
Cc: Nathan Fontenot <[email protected]>
Signed-off-by: Li Zhong <[email protected]>
Acked-by: Nishanth Aravamudan <[email protected]>
Tested-by: Cyril Bur <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/smp.c | 10 ++++++++--
arch/powerpc/mm/numa.c | 2 +-
2 files changed, 9 insertions(+), 3 deletions(-)

--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -379,8 +379,11 @@ void __init smp_prepare_cpus(unsigned in
/*
* numa_node_id() works after this.
*/
- set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
- set_cpu_numa_mem(cpu, local_memory_node(numa_cpu_lookup_table[cpu]));
+ if (cpu_present(cpu)) {
+ set_cpu_numa_node(cpu, numa_cpu_lookup_table[cpu]);
+ set_cpu_numa_mem(cpu,
+ local_memory_node(numa_cpu_lookup_table[cpu]));
+ }
}

cpumask_set_cpu(boot_cpuid, cpu_sibling_mask(boot_cpuid));
@@ -728,6 +731,9 @@ void start_secondary(void *unused)
}
traverse_core_siblings(cpu, true);

+ set_numa_node(numa_cpu_lookup_table[cpu]);
+ set_numa_mem(local_memory_node(numa_cpu_lookup_table[cpu]));
+
smp_wmb();
notify_cpu_starting(cpu);
set_cpu_online(cpu, true);
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1127,7 +1127,7 @@ void __init do_init_bootmem(void)
* even before we online them, so that we can use cpu_to_{node,mem}
* early in boot, cf. smp_prepare_cpus().
*/
- for_each_possible_cpu(cpu) {
+ for_each_present_cpu(cpu) {
numa_setup_cpu((unsigned long)cpu);
}
}

2014-10-28 03:40:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 119/146] ima: pass opened flag to identify newly created files

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Kasatkin <[email protected]>

commit 3034a146820c26fe6da66a45f6340fe87fe0983a upstream.

Empty files and missing xattrs do not guarantee that a file was
just created. This patch passes FILE_CREATED flag to IMA to
reliably identify new files.

Signed-off-by: Dmitry Kasatkin <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namei.c | 2 +-
fs/nfsd/vfs.c | 2 +-
include/linux/ima.h | 4 ++--
security/integrity/ima/ima.h | 4 ++--
security/integrity/ima/ima_appraise.c | 4 ++--
security/integrity/ima/ima_main.c | 16 ++++++++--------
6 files changed, 16 insertions(+), 16 deletions(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3074,7 +3074,7 @@ opened:
error = open_check_o_direct(file);
if (error)
goto exit_fput;
- error = ima_file_check(file, op->acc_mode);
+ error = ima_file_check(file, op->acc_mode, *opened);
if (error)
goto exit_fput;

--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -708,7 +708,7 @@ nfsd_open(struct svc_rqst *rqstp, struct
host_err = PTR_ERR(*filp);
*filp = NULL;
} else {
- host_err = ima_file_check(*filp, may_flags);
+ host_err = ima_file_check(*filp, may_flags, 0);

if (may_flags & NFSD_MAY_64BIT_COOKIE)
(*filp)->f_mode |= FMODE_64BITHASH;
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -15,7 +15,7 @@ struct linux_binprm;

#ifdef CONFIG_IMA
extern int ima_bprm_check(struct linux_binprm *bprm);
-extern int ima_file_check(struct file *file, int mask);
+extern int ima_file_check(struct file *file, int mask, int opened);
extern void ima_file_free(struct file *file);
extern int ima_file_mmap(struct file *file, unsigned long prot);
extern int ima_module_check(struct file *file);
@@ -27,7 +27,7 @@ static inline int ima_bprm_check(struct
return 0;
}

-static inline int ima_file_check(struct file *file, int mask)
+static inline int ima_file_check(struct file *file, int mask, int opened)
{
return 0;
}
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -177,7 +177,7 @@ void ima_delete_rules(void);
int ima_appraise_measurement(int func, struct integrity_iint_cache *iint,
struct file *file, const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len);
+ int xattr_len, int opened);
int ima_must_appraise(struct inode *inode, int mask, enum ima_hooks func);
void ima_update_xattr(struct integrity_iint_cache *iint, struct file *file);
enum integrity_status ima_get_cache_status(struct integrity_iint_cache *iint,
@@ -193,7 +193,7 @@ static inline int ima_appraise_measureme
struct file *file,
const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len)
+ int xattr_len, int opened)
{
return INTEGRITY_UNKNOWN;
}
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -183,7 +183,7 @@ int ima_read_xattr(struct dentry *dentry
int ima_appraise_measurement(int func, struct integrity_iint_cache *iint,
struct file *file, const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len)
+ int xattr_len, int opened)
{
static const char op[] = "appraise_data";
char *cause = "unknown";
@@ -203,7 +203,7 @@ int ima_appraise_measurement(int func, s

cause = "missing-hash";
status = INTEGRITY_NOLABEL;
- if (inode->i_size == 0) {
+ if (opened & FILE_CREATED) {
iint->flags |= IMA_NEW_FILE;
status = INTEGRITY_PASS;
}
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -157,7 +157,7 @@ void ima_file_free(struct file *file)
}

static int process_measurement(struct file *file, const char *filename,
- int mask, int function)
+ int mask, int function, int opened)
{
struct inode *inode = file_inode(file);
struct integrity_iint_cache *iint;
@@ -226,7 +226,7 @@ static int process_measurement(struct fi
xattr_value, xattr_len);
if (action & IMA_APPRAISE_SUBMASK)
rc = ima_appraise_measurement(_func, iint, file, pathname,
- xattr_value, xattr_len);
+ xattr_value, xattr_len, opened);
if (action & IMA_AUDIT)
ima_audit_measurement(iint, pathname);
kfree(pathbuf);
@@ -255,7 +255,7 @@ out:
int ima_file_mmap(struct file *file, unsigned long prot)
{
if (file && (prot & PROT_EXEC))
- return process_measurement(file, NULL, MAY_EXEC, MMAP_CHECK);
+ return process_measurement(file, NULL, MAY_EXEC, MMAP_CHECK, 0);
return 0;
}

@@ -277,7 +277,7 @@ int ima_bprm_check(struct linux_binprm *
return process_measurement(bprm->file,
(strcmp(bprm->filename, bprm->interp) == 0) ?
bprm->filename : bprm->interp,
- MAY_EXEC, BPRM_CHECK);
+ MAY_EXEC, BPRM_CHECK, 0);
}

/**
@@ -290,12 +290,12 @@ int ima_bprm_check(struct linux_binprm *
* On success return 0. On integrity appraisal error, assuming the file
* is in policy and IMA-appraisal is in enforcing mode, return -EACCES.
*/
-int ima_file_check(struct file *file, int mask)
+int ima_file_check(struct file *file, int mask, int opened)
{
ima_rdwr_violation_check(file);
return process_measurement(file, NULL,
mask & (MAY_READ | MAY_WRITE | MAY_EXEC),
- FILE_CHECK);
+ FILE_CHECK, opened);
}
EXPORT_SYMBOL_GPL(ima_file_check);

@@ -318,7 +318,7 @@ int ima_module_check(struct file *file)
#endif
return 0; /* We rely on module signature checking */
}
- return process_measurement(file, NULL, MAY_EXEC, MODULE_CHECK);
+ return process_measurement(file, NULL, MAY_EXEC, MODULE_CHECK, 0);
}

int ima_fw_from_file(struct file *file, char *buf, size_t size)
@@ -329,7 +329,7 @@ int ima_fw_from_file(struct file *file,
return -EACCES; /* INTEGRITY_UNKNOWN */
return 0;
}
- return process_measurement(file, NULL, MAY_EXEC, FIRMWARE_CHECK);
+ return process_measurement(file, NULL, MAY_EXEC, FIRMWARE_CHECK, 0);
}

static int __init init_ima(void)

2014-10-28 03:40:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 120/146] sparc32: dma_alloc_coherent must honour gfp flags

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Hellstrom <[email protected]>

[ Upstream commit d1105287aabe88dbb3af825140badaa05cf0442c ]

dma_zalloc_coherent() calls dma_alloc_coherent(__GFP_ZERO)
but the sparc32 implementations sbus_alloc_coherent() and
pci32_alloc_coherent() doesn't take the gfp flags into
account.

Tested on the SPARC32/LEON GRETH Ethernet driver which fails
due to dma_alloc_coherent(__GFP_ZERO) returns non zeroed
pages.

Signed-off-by: Daniel Hellstrom <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/ioport.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -278,7 +278,8 @@ static void *sbus_alloc_coherent(struct
}

order = get_order(len_total);
- if ((va = __get_free_pages(GFP_KERNEL|__GFP_COMP, order)) == 0)
+ va = __get_free_pages(gfp, order);
+ if (va == 0)
goto err_nopages;

if ((res = kzalloc(sizeof(struct resource), GFP_KERNEL)) == NULL)
@@ -443,7 +444,7 @@ static void *pci32_alloc_coherent(struct
}

order = get_order(len_total);
- va = (void *) __get_free_pages(GFP_KERNEL, order);
+ va = (void *) __get_free_pages(gfp, order);
if (va == NULL) {
printk("pci_alloc_consistent: no %ld pages\n", len_total>>PAGE_SHIFT);
goto err_nopages;

2014-10-28 03:40:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 124/146] sparc64: Move request_irq() from ldc_bind() to ldc_alloc()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sowmini Varadhan <[email protected]>

[ Upstream commit c21c4ab0d6921f7160a43216fa6973b5924de561 ]

The request_irq() needs to be done from ldc_alloc()
to avoid the following (caught by lockdep)

[00000000004a0738] __might_sleep+0xf8/0x120
[000000000058bea4] kmem_cache_alloc_trace+0x184/0x2c0
[00000000004faf80] request_threaded_irq+0x80/0x160
[000000000044f71c] ldc_bind+0x7c/0x220
[0000000000452454] vio_port_up+0x54/0xe0
[00000000101f6778] probe_disk+0x38/0x220 [sunvdc]
[00000000101f6b8c] vdc_port_probe+0x22c/0x300 [sunvdc]
[0000000000451a88] vio_device_probe+0x48/0x60
[000000000074c56c] really_probe+0x6c/0x300
[000000000074c83c] driver_probe_device+0x3c/0xa0
[000000000074c92c] __driver_attach+0x8c/0xa0
[000000000074a6ec] bus_for_each_dev+0x6c/0xa0
[000000000074c1dc] driver_attach+0x1c/0x40
[000000000074b0fc] bus_add_driver+0xbc/0x280

Signed-off-by: Sowmini Varadhan <[email protected]>
Acked-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/ldc.h | 5 +++--
arch/sparc/kernel/ds.c | 4 ++--
arch/sparc/kernel/ldc.c | 41 +++++++++++++++++++++--------------------
arch/sparc/kernel/viohs.c | 4 ++--
4 files changed, 28 insertions(+), 26 deletions(-)

--- a/arch/sparc/include/asm/ldc.h
+++ b/arch/sparc/include/asm/ldc.h
@@ -53,13 +53,14 @@ struct ldc_channel;
/* Allocate state for a channel. */
struct ldc_channel *ldc_alloc(unsigned long id,
const struct ldc_channel_config *cfgp,
- void *event_arg);
+ void *event_arg,
+ const char *name);

/* Shut down and free state for a channel. */
void ldc_free(struct ldc_channel *lp);

/* Register TX and RX queues of the link with the hypervisor. */
-int ldc_bind(struct ldc_channel *lp, const char *name);
+int ldc_bind(struct ldc_channel *lp);

/* For non-RAW protocols we need to complete a handshake before
* communication can proceed. ldc_connect() does that, if the
--- a/arch/sparc/kernel/ds.c
+++ b/arch/sparc/kernel/ds.c
@@ -1200,14 +1200,14 @@ static int ds_probe(struct vio_dev *vdev
ds_cfg.tx_irq = vdev->tx_irq;
ds_cfg.rx_irq = vdev->rx_irq;

- lp = ldc_alloc(vdev->channel_id, &ds_cfg, dp);
+ lp = ldc_alloc(vdev->channel_id, &ds_cfg, dp, "DS");
if (IS_ERR(lp)) {
err = PTR_ERR(lp);
goto out_free_ds_states;
}
dp->lp = lp;

- err = ldc_bind(lp, "DS");
+ err = ldc_bind(lp);
if (err)
goto out_free_ldc;

--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -1078,7 +1078,8 @@ static void ldc_iommu_release(struct ldc

struct ldc_channel *ldc_alloc(unsigned long id,
const struct ldc_channel_config *cfgp,
- void *event_arg)
+ void *event_arg,
+ const char *name)
{
struct ldc_channel *lp;
const struct ldc_mode_ops *mops;
@@ -1093,6 +1094,8 @@ struct ldc_channel *ldc_alloc(unsigned l
err = -EINVAL;
if (!cfgp)
goto out_err;
+ if (!name)
+ goto out_err;

switch (cfgp->mode) {
case LDC_MODE_RAW:
@@ -1185,6 +1188,21 @@ struct ldc_channel *ldc_alloc(unsigned l

INIT_HLIST_HEAD(&lp->mh_list);

+ snprintf(lp->rx_irq_name, LDC_IRQ_NAME_MAX, "%s RX", name);
+ snprintf(lp->tx_irq_name, LDC_IRQ_NAME_MAX, "%s TX", name);
+
+ err = request_irq(lp->cfg.rx_irq, ldc_rx, 0,
+ lp->rx_irq_name, lp);
+ if (err)
+ goto out_free_txq;
+
+ err = request_irq(lp->cfg.tx_irq, ldc_tx, 0,
+ lp->tx_irq_name, lp);
+ if (err) {
+ free_irq(lp->cfg.rx_irq, lp);
+ goto out_free_txq;
+ }
+
return lp;

out_free_txq:
@@ -1237,31 +1255,14 @@ EXPORT_SYMBOL(ldc_free);
* state. This does not initiate a handshake, ldc_connect() does
* that.
*/
-int ldc_bind(struct ldc_channel *lp, const char *name)
+int ldc_bind(struct ldc_channel *lp)
{
unsigned long hv_err, flags;
int err = -EINVAL;

- if (!name ||
- (lp->state != LDC_STATE_INIT))
+ if (lp->state != LDC_STATE_INIT)
return -EINVAL;

- snprintf(lp->rx_irq_name, LDC_IRQ_NAME_MAX, "%s RX", name);
- snprintf(lp->tx_irq_name, LDC_IRQ_NAME_MAX, "%s TX", name);
-
- err = request_irq(lp->cfg.rx_irq, ldc_rx, 0,
- lp->rx_irq_name, lp);
- if (err)
- return err;
-
- err = request_irq(lp->cfg.tx_irq, ldc_tx, 0,
- lp->tx_irq_name, lp);
- if (err) {
- free_irq(lp->cfg.rx_irq, lp);
- return err;
- }
-
-
spin_lock_irqsave(&lp->lock, flags);

enable_irq(lp->cfg.rx_irq);
--- a/arch/sparc/kernel/viohs.c
+++ b/arch/sparc/kernel/viohs.c
@@ -714,7 +714,7 @@ int vio_ldc_alloc(struct vio_driver_stat
cfg.tx_irq = vio->vdev->tx_irq;
cfg.rx_irq = vio->vdev->rx_irq;

- lp = ldc_alloc(vio->vdev->channel_id, &cfg, event_arg);
+ lp = ldc_alloc(vio->vdev->channel_id, &cfg, event_arg, vio->name);
if (IS_ERR(lp))
return PTR_ERR(lp);

@@ -746,7 +746,7 @@ void vio_port_up(struct vio_driver_state

err = 0;
if (state == LDC_STATE_INIT) {
- err = ldc_bind(vio->lp, vio->name);
+ err = ldc_bind(vio->lp);
if (err)
printk(KERN_WARNING "%s: Port %lu bind failed, "
"err=%d\n",

2014-10-28 03:41:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 108/146] mm/cma: fix cma bitmap aligned mask computing

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Weijie Yang <[email protected]>

commit 68faed630fc151a7a1c4853df00fb3dcacf782b4 upstream.

The current cma bitmap aligned mask computation is incorrect. It could
cause an unexpected alignment when using cma_alloc() if the wanted align
order is larger than cma->order_per_bit.

Take kvm for example (PAGE_SHIFT = 12), kvm_cma->order_per_bit is set to
6. When kvm_alloc_rma() tries to alloc kvm_rma_pages, it will use 15 as
the expected align value. After using the current implementation however,
we get 0 as cma bitmap aligned mask other than 511.

This patch fixes the cma bitmap aligned mask calculation.

[[email protected]: coding-style fixes]
Signed-off-by: Weijie Yang <[email protected]>
Acked-by: Michal Nazarewicz <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/cma.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/cma.c
+++ b/mm/cma.c
@@ -57,7 +57,9 @@ unsigned long cma_get_size(struct cma *c

static unsigned long cma_bitmap_aligned_mask(struct cma *cma, int align_order)
{
- return (1UL << (align_order >> cma->order_per_bit)) - 1;
+ if (align_order <= cma->order_per_bit)
+ return 0;
+ return (1UL << (align_order - cma->order_per_bit)) - 1;
}

static unsigned long cma_bitmap_maxno(struct cma *cma)

2014-10-28 03:41:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 109/146] kernel: add support for gcc 5

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sasha Levin <[email protected]>

commit 71458cfc782eafe4b27656e078d379a34e472adf upstream.

We're missing include/linux/compiler-gcc5.h which is required now
because gcc branched off to v5 in trunk.

Just copy the relevant bits out of include/linux/compiler-gcc4.h,
no new code is added as of now.

This fixes a build error when using gcc 5.

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/compiler-gcc5.h | 66 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)

--- /dev/null
+++ b/include/linux/compiler-gcc5.h
@@ -0,0 +1,66 @@
+#ifndef __LINUX_COMPILER_H
+#error "Please don't include <linux/compiler-gcc5.h> directly, include <linux/compiler.h> instead."
+#endif
+
+#define __used __attribute__((__used__))
+#define __must_check __attribute__((warn_unused_result))
+#define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
+
+/* Mark functions as cold. gcc will assume any path leading to a call
+ to them will be unlikely. This means a lot of manual unlikely()s
+ are unnecessary now for any paths leading to the usual suspects
+ like BUG(), printk(), panic() etc. [but let's keep them for now for
+ older compilers]
+
+ Early snapshots of gcc 4.3 don't support this and we can't detect this
+ in the preprocessor, but we can live with this because they're unreleased.
+ Maketime probing would be overkill here.
+
+ gcc also has a __attribute__((__hot__)) to move hot functions into
+ a special section, but I don't see any sense in this right now in
+ the kernel context */
+#define __cold __attribute__((__cold__))
+
+#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
+
+#ifndef __CHECKER__
+# define __compiletime_warning(message) __attribute__((warning(message)))
+# define __compiletime_error(message) __attribute__((error(message)))
+#endif /* __CHECKER__ */
+
+/*
+ * Mark a position in code as unreachable. This can be used to
+ * suppress control flow warnings after asm blocks that transfer
+ * control elsewhere.
+ *
+ * Early snapshots of gcc 4.5 don't support this and we can't detect
+ * this in the preprocessor, but we can live with this because they're
+ * unreleased. Really, we need to have autoconf for the kernel.
+ */
+#define unreachable() __builtin_unreachable()
+
+/* Mark a function definition as prohibited from being cloned. */
+#define __noclone __attribute__((__noclone__))
+
+/*
+ * Tell the optimizer that something else uses this function or variable.
+ */
+#define __visible __attribute__((externally_visible))
+
+/*
+ * GCC 'asm goto' miscompiles certain code sequences:
+ *
+ * http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
+ *
+ * Work it around via a compiler barrier quirk suggested by Jakub Jelinek.
+ * Fixed in GCC 4.8.2 and later versions.
+ *
+ * (asm goto is automatically volatile - the naming reflects this.)
+ */
+#define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
+
+#ifdef CONFIG_ARCH_USE_BUILTIN_BSWAP
+#define __HAVE_BUILTIN_BSWAP32__
+#define __HAVE_BUILTIN_BSWAP64__
+#define __HAVE_BUILTIN_BSWAP16__
+#endif /* CONFIG_ARCH_USE_BUILTIN_BSWAP */

2014-10-28 03:41:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 111/146] mm/balloon_compaction: redesign ballooned pages management

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konstantin Khlebnikov <[email protected]>

commit d6d86c0a7f8ddc5b38cf089222cb1d9540762dc2 upstream.

Sasha Levin reported KASAN splash inside isolate_migratepages_range().
Problem is in the function __is_movable_balloon_page() which tests
AS_BALLOON_MAP in page->mapping->flags. This function has no protection
against anonymous pages. As result it tried to check address space flags
inside struct anon_vma.

Further investigation shows more problems in current implementation:

* Special branch in __unmap_and_move() never works:
balloon_page_movable() checks page flags and page_count. In
__unmap_and_move() page is locked, reference counter is elevated, thus
balloon_page_movable() always fails. As a result execution goes to the
normal migration path. virtballoon_migratepage() returns
MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
move_to_new_page() thinks this is an error code and assigns
newpage->mapping to NULL. Newly migrated page lose connectivity with
balloon an all ability for further migration.

* lru_lock erroneously required in isolate_migratepages_range() for
isolation ballooned page. This function releases lru_lock periodically,
this makes migration mostly impossible for some pages.

* balloon_page_dequeue have a tight race with balloon_page_isolate:
balloon_page_isolate could be executed in parallel with dequeue between
picking page from list and locking page_lock. Race is rare because they
use trylock_page() for locking.

This patch fixes all of them.

Instead of fake mapping with special flag this patch uses special state of
page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256. Buddy allocator uses
PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose. Storing mark
directly in struct page makes everything safer and easier.

PagePrivate is used to mark pages present in page list (i.e. not
isolated, like PageLRU for normal pages). It replaces special rules for
reference counter and makes balloon migration similar to migration of
normal pages. This flag is protected by page_lock together with link to
the balloon device.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Link: http://lkml.kernel.org/p/[email protected]
Cc: Rafael Aquini <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/virtio/virtio_balloon.c | 15 ++---
include/linux/balloon_compaction.h | 97 +++++++++----------------------------
include/linux/migrate.h | 11 ----
include/linux/mm.h | 19 +++++++
mm/balloon_compaction.c | 26 ++++-----
mm/compaction.c | 2
mm/migrate.c | 16 +-----
7 files changed, 68 insertions(+), 118 deletions(-)

--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -163,8 +163,8 @@ static void release_pages_by_pfn(const u
/* Find pfns pointing at start of each page, get pages and free them. */
for (i = 0; i < num; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
struct page *page = balloon_pfn_to_page(pfns[i]);
- balloon_page_free(page);
adjust_managed_page_count(page, 1);
+ put_page(page); /* balloon reference */
}
}

@@ -395,6 +395,8 @@ static int virtballoon_migratepage(struc
if (!mutex_trylock(&vb->balloon_lock))
return -EAGAIN;

+ get_page(newpage); /* balloon reference */
+
/* balloon's page migration 1st step -- inflate "newpage" */
spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
balloon_page_insert(newpage, mapping, &vb_dev_info->pages);
@@ -404,12 +406,7 @@ static int virtballoon_migratepage(struc
set_page_pfns(vb->pfns, newpage);
tell_host(vb, vb->inflate_vq);

- /*
- * balloon's page migration 2nd step -- deflate "page"
- *
- * It's safe to delete page->lru here because this page is at
- * an isolated migration list, and this step is expected to happen here
- */
+ /* balloon's page migration 2nd step -- deflate "page" */
balloon_page_delete(page);
vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
set_page_pfns(vb->pfns, page);
@@ -417,7 +414,9 @@ static int virtballoon_migratepage(struc

mutex_unlock(&vb->balloon_lock);

- return MIGRATEPAGE_BALLOON_SUCCESS;
+ put_page(page); /* balloon reference */
+
+ return MIGRATEPAGE_SUCCESS;
}

/* define the balloon_mapping->a_ops callback to allow balloon page migration */
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -27,10 +27,13 @@
* counter raised only while it is under our special handling;
*
* iii. after the lockless scan step have selected a potential balloon page for
- * isolation, re-test the page->mapping flags and the page ref counter
+ * isolation, re-test the PageBalloon mark and the PagePrivate flag
* under the proper page lock, to ensure isolating a valid balloon page
* (not yet isolated, nor under release procedure)
*
+ * iv. isolation or dequeueing procedure must clear PagePrivate flag under
+ * page lock together with removing page from balloon device page list.
+ *
* The functions provided by this interface are placed to help on coping with
* the aforementioned balloon page corner case, as well as to ensure the simple
* set of exposed rules are satisfied while we are dealing with balloon pages
@@ -71,28 +74,6 @@ static inline void balloon_devinfo_free(
kfree(b_dev_info);
}

-/*
- * balloon_page_free - release a balloon page back to the page free lists
- * @page: ballooned page to be set free
- *
- * This function must be used to properly set free an isolated/dequeued balloon
- * page at the end of a sucessful page migration, or at the balloon driver's
- * page release procedure.
- */
-static inline void balloon_page_free(struct page *page)
-{
- /*
- * Balloon pages always get an extra refcount before being isolated
- * and before being dequeued to help on sorting out fortuite colisions
- * between a thread attempting to isolate and another thread attempting
- * to release the very same balloon page.
- *
- * Before we handle the page back to Buddy, lets drop its extra refcnt.
- */
- put_page(page);
- __free_page(page);
-}
-
#ifdef CONFIG_BALLOON_COMPACTION
extern bool balloon_page_isolate(struct page *page);
extern void balloon_page_putback(struct page *page);
@@ -108,74 +89,33 @@ static inline void balloon_mapping_free(
}

/*
- * page_flags_cleared - helper to perform balloon @page ->flags tests.
- *
- * As balloon pages are obtained from buddy and we do not play with page->flags
- * at driver level (exception made when we get the page lock for compaction),
- * we can safely identify a ballooned page by checking if the
- * PAGE_FLAGS_CHECK_AT_PREP page->flags are all cleared. This approach also
- * helps us skip ballooned pages that are locked for compaction or release, thus
- * mitigating their racy check at balloon_page_movable()
- */
-static inline bool page_flags_cleared(struct page *page)
-{
- return !(page->flags & PAGE_FLAGS_CHECK_AT_PREP);
-}
-
-/*
- * __is_movable_balloon_page - helper to perform @page mapping->flags tests
+ * __is_movable_balloon_page - helper to perform @page PageBalloon tests
*/
static inline bool __is_movable_balloon_page(struct page *page)
{
- struct address_space *mapping = page->mapping;
- return mapping_balloon(mapping);
+ return PageBalloon(page);
}

/*
- * balloon_page_movable - test page->mapping->flags to identify balloon pages
- * that can be moved by compaction/migration.
- *
- * This function is used at core compaction's page isolation scheme, therefore
- * most pages exposed to it are not enlisted as balloon pages and so, to avoid
- * undesired side effects like racing against __free_pages(), we cannot afford
- * holding the page locked while testing page->mapping->flags here.
+ * balloon_page_movable - test PageBalloon to identify balloon pages
+ * and PagePrivate to check that the page is not
+ * isolated and can be moved by compaction/migration.
*
* As we might return false positives in the case of a balloon page being just
- * released under us, the page->mapping->flags need to be re-tested later,
- * under the proper page lock, at the functions that will be coping with the
- * balloon page case.
+ * released under us, this need to be re-tested later, under the page lock.
*/
static inline bool balloon_page_movable(struct page *page)
{
- /*
- * Before dereferencing and testing mapping->flags, let's make sure
- * this is not a page that uses ->mapping in a different way
- */
- if (page_flags_cleared(page) && !page_mapped(page) &&
- page_count(page) == 1)
- return __is_movable_balloon_page(page);
-
- return false;
+ return PageBalloon(page) && PagePrivate(page);
}

/*
* isolated_balloon_page - identify an isolated balloon page on private
* compaction/migration page lists.
- *
- * After a compaction thread isolates a balloon page for migration, it raises
- * the page refcount to prevent concurrent compaction threads from re-isolating
- * the same page. For that reason putback_movable_pages(), or other routines
- * that need to identify isolated balloon pages on private pagelists, cannot
- * rely on balloon_page_movable() to accomplish the task.
*/
static inline bool isolated_balloon_page(struct page *page)
{
- /* Already isolated balloon pages, by default, have a raised refcount */
- if (page_flags_cleared(page) && !page_mapped(page) &&
- page_count(page) >= 2)
- return __is_movable_balloon_page(page);
-
- return false;
+ return PageBalloon(page);
}

/*
@@ -192,6 +132,8 @@ static inline void balloon_page_insert(s
struct address_space *mapping,
struct list_head *head)
{
+ __SetPageBalloon(page);
+ SetPagePrivate(page);
page->mapping = mapping;
list_add(&page->lru, head);
}
@@ -206,8 +148,12 @@ static inline void balloon_page_insert(s
*/
static inline void balloon_page_delete(struct page *page)
{
+ __ClearPageBalloon(page);
page->mapping = NULL;
- list_del(&page->lru);
+ if (PagePrivate(page)) {
+ ClearPagePrivate(page);
+ list_del(&page->lru);
+ }
}

/*
@@ -258,6 +204,11 @@ static inline void balloon_page_delete(s
list_del(&page->lru);
}

+static inline bool __is_movable_balloon_page(struct page *page)
+{
+ return false;
+}
+
static inline bool balloon_page_movable(struct page *page)
{
return false;
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -13,18 +13,9 @@ typedef void free_page_t(struct page *pa
* Return values from addresss_space_operations.migratepage():
* - negative errno on page migration failure;
* - zero on page migration success;
- *
- * The balloon page migration introduces this special case where a 'distinct'
- * return code is used to flag a successful page migration to unmap_and_move().
- * This approach is necessary because page migration can race against balloon
- * deflation procedure, and for such case we could introduce a nasty page leak
- * if a successfully migrated balloon page gets released concurrently with
- * migration's unmap_and_move() wrap-up steps.
*/
#define MIGRATEPAGE_SUCCESS 0
-#define MIGRATEPAGE_BALLOON_SUCCESS 1 /* special ret code for balloon page
- * sucessful migration case.
- */
+
enum migrate_reason {
MR_COMPACTION,
MR_MEMORY_FAILURE,
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -553,6 +553,25 @@ static inline void __ClearPageBuddy(stru
atomic_set(&page->_mapcount, -1);
}

+#define PAGE_BALLOON_MAPCOUNT_VALUE (-256)
+
+static inline int PageBalloon(struct page *page)
+{
+ return atomic_read(&page->_mapcount) == PAGE_BALLOON_MAPCOUNT_VALUE;
+}
+
+static inline void __SetPageBalloon(struct page *page)
+{
+ VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
+ atomic_set(&page->_mapcount, PAGE_BALLOON_MAPCOUNT_VALUE);
+}
+
+static inline void __ClearPageBalloon(struct page *page)
+{
+ VM_BUG_ON_PAGE(!PageBalloon(page), page);
+ atomic_set(&page->_mapcount, -1);
+}
+
void put_page(struct page *page);
void put_pages_list(struct list_head *pages);

--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -93,17 +93,12 @@ struct page *balloon_page_dequeue(struct
* to be released by the balloon driver.
*/
if (trylock_page(page)) {
+ if (!PagePrivate(page)) {
+ /* raced with isolation */
+ unlock_page(page);
+ continue;
+ }
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
- /*
- * Raise the page refcount here to prevent any wrong
- * attempt to isolate this page, in case of coliding
- * with balloon_page_isolate() just after we release
- * the page lock.
- *
- * balloon_page_free() will take care of dropping
- * this extra refcount later.
- */
- get_page(page);
balloon_page_delete(page);
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
unlock_page(page);
@@ -187,7 +182,9 @@ static inline void __isolate_balloon_pag
{
struct balloon_dev_info *b_dev_info = page->mapping->private_data;
unsigned long flags;
+
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+ ClearPagePrivate(page);
list_del(&page->lru);
b_dev_info->isolated_pages++;
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
@@ -197,7 +194,9 @@ static inline void __putback_balloon_pag
{
struct balloon_dev_info *b_dev_info = page->mapping->private_data;
unsigned long flags;
+
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+ SetPagePrivate(page);
list_add(&page->lru, &b_dev_info->pages);
b_dev_info->isolated_pages--;
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
@@ -235,12 +234,11 @@ bool balloon_page_isolate(struct page *p
*/
if (likely(trylock_page(page))) {
/*
- * A ballooned page, by default, has just one refcount.
+ * A ballooned page, by default, has PagePrivate set.
* Prevent concurrent compaction threads from isolating
- * an already isolated balloon page by refcount check.
+ * an already isolated balloon page by clearing it.
*/
- if (__is_movable_balloon_page(page) &&
- page_count(page) == 2) {
+ if (balloon_page_movable(page)) {
__isolate_balloon_page(page);
unlock_page(page);
return true;
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -597,7 +597,7 @@ isolate_migratepages_range(struct zone *
*/
if (!PageLRU(page)) {
if (unlikely(balloon_page_movable(page))) {
- if (locked && balloon_page_isolate(page)) {
+ if (balloon_page_isolate(page)) {
/* Successfully isolated */
goto isolate_success;
}
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -876,7 +876,7 @@ static int __unmap_and_move(struct page
}
}

- if (unlikely(balloon_page_movable(page))) {
+ if (unlikely(isolated_balloon_page(page))) {
/*
* A ballooned page does not need any special attention from
* physical to virtual reverse mapping procedures.
@@ -955,17 +955,6 @@ static int unmap_and_move(new_page_t get

rc = __unmap_and_move(page, newpage, force, mode);

- if (unlikely(rc == MIGRATEPAGE_BALLOON_SUCCESS)) {
- /*
- * A ballooned page has been migrated already.
- * Now, it's the time to wrap-up counters,
- * handle the page back to Buddy and return.
- */
- dec_zone_page_state(page, NR_ISOLATED_ANON +
- page_is_file_cache(page));
- balloon_page_free(page);
- return MIGRATEPAGE_SUCCESS;
- }
out:
if (rc != -EAGAIN) {
/*
@@ -988,6 +977,9 @@ out:
if (rc != MIGRATEPAGE_SUCCESS && put_new_page) {
ClearPageSwapBacked(newpage);
put_new_page(newpage, private);
+ } else if (unlikely(__is_movable_balloon_page(newpage))) {
+ /* drop our reference, page already in the balloon */
+ put_page(newpage);
} else
putback_lru_page(newpage);


2014-10-28 03:41:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 103/146] Bluetooth: 6lowpan: Increase the connection timeout value

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jukka Rissanen <[email protected]>

commit 2ae50d8d3aaf7154f72b44331b71f15799cdc1bb upstream.

Use the default connection timeout value defined in l2cap.h because
the current timeout was too short and most of the time the connection
attempts timed out.

Signed-off-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/bluetooth/6lowpan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -890,7 +890,7 @@ static void chan_resume_cb(struct l2cap_

static long chan_get_sndtimeo_cb(struct l2cap_chan *chan)
{
- return msecs_to_jiffies(1000);
+ return L2CAP_CONN_TIMEOUT;
}

static const struct l2cap_ops bt_6lowpan_chan_ops = {

2014-10-28 03:41:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 136/146] sparc64: Define VA hole at run time, rather than at compile time.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 4397bed080598001e88f612deb8b080bb1cc2322 ]

Now that we use 4-level page tables, we can provide up to 53-bits of
virtual address space to the user.

Adjust the VA hole based upon the capabilities of the cpu type probed.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/page_64.h | 15 ++++-----------
arch/sparc/mm/init_64.c | 21 +++++++++++++++++++++
2 files changed, 25 insertions(+), 11 deletions(-)

--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -102,21 +102,14 @@ typedef unsigned long pgprot_t;

typedef pte_t *pgtable_t;

-/* These two values define the virtual address space range in which we
- * must forbid 64-bit user processes from making mappings. It used to
- * represent precisely the virtual address space hole present in most
- * early sparc64 chips including UltraSPARC-I. But now it also is
- * further constrained by the limits of our page tables, which is
- * 43-bits of virtual address.
- */
-#define SPARC64_VA_HOLE_TOP _AC(0xfffffc0000000000,UL)
-#define SPARC64_VA_HOLE_BOTTOM _AC(0x0000040000000000,UL)
+extern unsigned long sparc64_va_hole_top;
+extern unsigned long sparc64_va_hole_bottom;

/* The next two defines specify the actual exclusion region we
* enforce, wherein we use a 4GB red zone on each side of the VA hole.
*/
-#define VA_EXCLUDE_START (SPARC64_VA_HOLE_BOTTOM - (1UL << 32UL))
-#define VA_EXCLUDE_END (SPARC64_VA_HOLE_TOP + (1UL << 32UL))
+#define VA_EXCLUDE_START (sparc64_va_hole_bottom - (1UL << 32UL))
+#define VA_EXCLUDE_END (sparc64_va_hole_top + (1UL << 32UL))

#define TASK_UNMAPPED_BASE (test_thread_flag(TIF_32BIT) ? \
_AC(0x0000000070000000,UL) : \
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1630,25 +1630,46 @@ static void __init page_offset_shift_pat
}
}

+unsigned long sparc64_va_hole_top = 0xfffff80000000000UL;
+unsigned long sparc64_va_hole_bottom = 0x0000080000000000UL;
+
static void __init setup_page_offset(void)
{
unsigned long max_phys_bits = 40;

if (tlb_type == cheetah || tlb_type == cheetah_plus) {
+ /* Cheetah/Panther support a full 64-bit virtual
+ * address, so we can use all that our page tables
+ * support.
+ */
+ sparc64_va_hole_top = 0xfff0000000000000UL;
+ sparc64_va_hole_bottom = 0x0010000000000000UL;
+
max_phys_bits = 42;
} else if (tlb_type == hypervisor) {
switch (sun4v_chip_type) {
case SUN4V_CHIP_NIAGARA1:
case SUN4V_CHIP_NIAGARA2:
+ /* T1 and T2 support 48-bit virtual addresses. */
+ sparc64_va_hole_top = 0xffff800000000000UL;
+ sparc64_va_hole_bottom = 0x0000800000000000UL;
+
max_phys_bits = 39;
break;
case SUN4V_CHIP_NIAGARA3:
+ /* T3 supports 48-bit virtual addresses. */
+ sparc64_va_hole_top = 0xffff800000000000UL;
+ sparc64_va_hole_bottom = 0x0000800000000000UL;
+
max_phys_bits = 43;
break;
case SUN4V_CHIP_NIAGARA4:
case SUN4V_CHIP_NIAGARA5:
case SUN4V_CHIP_SPARC64X:
default:
+ /* T4 and later support 52-bit virtual addresses. */
+ sparc64_va_hole_top = 0xfff8000000000000UL;
+ sparc64_va_hole_bottom = 0x0008000000000000UL;
max_phys_bits = 47;
break;
}

2014-10-28 03:41:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 140/146] sparc64: Increase MAX_PHYS_ADDRESS_BITS to 53.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

Make sure, at compile time, that the kernel can properly support
whatever MAX_PHYS_ADDRESS_BITS is defined to.

On M7 chips, use a max_phys_bits value of 49.

Based upon a patch by Bob Picco.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/page_64.h | 8 ++++----
arch/sparc/include/asm/pgtable_64.h | 4 ++++
arch/sparc/mm/init_64.c | 9 ++++++++-
3 files changed, 16 insertions(+), 5 deletions(-)

--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -122,11 +122,11 @@ extern unsigned long PAGE_OFFSET;

#endif /* !(__ASSEMBLY__) */

-/* The maximum number of physical memory address bits we support, this
- * is used to size various tables used to manage kernel TLB misses and
- * also the sparsemem code.
+/* The maximum number of physical memory address bits we support. The
+ * largest value we can support is whatever "KPGD_SHIFT + KPTE_BITS"
+ * evaluates to.
*/
-#define MAX_PHYS_ADDRESS_BITS 47
+#define MAX_PHYS_ADDRESS_BITS 53

#define ILOG2_4MB 22
#define ILOG2_256MB 28
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -67,6 +67,10 @@
#define PGDIR_MASK (~(PGDIR_SIZE-1))
#define PGDIR_BITS (PAGE_SHIFT - 3)

+#if (MAX_PHYS_ADDRESS_BITS > PGDIR_SHIFT + PGDIR_BITS)
+#error MAX_PHYS_ADDRESS_BITS exceeds what kernel page tables can support
+#endif
+
#if (PGDIR_SHIFT + PGDIR_BITS) != 53
#error Page table parameters do not cover virtual address space properly.
#endif
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1690,12 +1690,19 @@ static void __init setup_page_offset(voi
case SUN4V_CHIP_NIAGARA4:
case SUN4V_CHIP_NIAGARA5:
case SUN4V_CHIP_SPARC64X:
- default:
+ case SUN4V_CHIP_SPARC_M6:
/* T4 and later support 52-bit virtual addresses. */
sparc64_va_hole_top = 0xfff8000000000000UL;
sparc64_va_hole_bottom = 0x0008000000000000UL;
max_phys_bits = 47;
break;
+ case SUN4V_CHIP_SPARC_M7:
+ default:
+ /* M7 and later support 52-bit virtual addresses. */
+ sparc64_va_hole_top = 0xfff8000000000000UL;
+ sparc64_va_hole_bottom = 0x0008000000000000UL;
+ max_phys_bits = 49;
+ break;
}
}


2014-10-28 03:41:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 142/146] sparc64: sparse irq

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: bob picco <[email protected]>

[ Upstream commit ee6a9333fa58e11577c1b531b8e0f5ffc0fd6f50 ]

This patch attempts to do a few things. The highlights are: 1) enable
SPARSE_IRQ unconditionally, 2) kills off !SPARSE_IRQ code 3) allocates
ivector_table at boot time and 4) default to cookie only VIRQ mechanism
for supported firmware. The first firmware with cookie only support for
me appears on T5. You can optionally force the HV firmware to not cookie
only mode which is the sysino support.

The sysino is a deprecated HV mechanism according to the most recent
SPARC Virtual Machine Specification. HV_GRP_INTR is what controls the
cookie/sysino firmware versioning.

The history of this interface is:

1) Major version 1.0 only supported sysino based interrupt interfaces.

2) Major version 2.0 added cookie based VIRQs, however due to the fact
that OSs were using the VIRQs without negoatiating major version
2.0 (Linux and Solaris are both guilty), the VIRQs calls were
allowed even with major version 1.0

To complicate things even further, the VIRQ interfaces were only
actually hooked up in the hypervisor for LDC interrupt sources.
VIRQ calls on other device types would result in HV_EINVAL errors.

So effectively, major version 2.0 is unusable.

3) Major version 3.0 was created to signal use of VIRQs and the fact
that the hypervisor has these calls hooked up for all interrupt
sources, not just those for LDC devices.

A new boot option is provided should cookie only HV support have issues.
hvirq - this is the version for HV_GRP_INTR. This is related to HV API
versioning. The code attempts major=3 first by default. The option can
be used to override this default.

I've tested with SPARSE_IRQ on T5-8, M7-4 and T4-X and Jalap?no.

Signed-off-by: Bob Picco <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/Kconfig | 1
arch/sparc/include/asm/irq_64.h | 7
arch/sparc/kernel/irq_64.c | 509 ++++++++++++++++++++++++++--------------
3 files changed, 342 insertions(+), 175 deletions(-)

--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -67,6 +67,7 @@ config SPARC64
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_CONTEXT_TRACKING
select HAVE_DEBUG_KMEMLEAK
+ select SPARSE_IRQ
select RTC_DRV_CMOS
select RTC_DRV_BQ4802
select RTC_DRV_SUN4V
--- a/arch/sparc/include/asm/irq_64.h
+++ b/arch/sparc/include/asm/irq_64.h
@@ -37,7 +37,7 @@
*
* ino_bucket->irq allocation is made during {sun4v_,}build_irq().
*/
-#define NR_IRQS 255
+#define NR_IRQS (2048)

void irq_install_pre_handler(int irq,
void (*func)(unsigned int, void *, void *),
@@ -57,11 +57,8 @@ unsigned int sun4u_build_msi(u32 portid,
unsigned long iclr_base);
void sun4u_destroy_msi(unsigned int irq);

-unsigned char irq_alloc(unsigned int dev_handle,
- unsigned int dev_ino);
-#ifdef CONFIG_PCI_MSI
+unsigned int irq_alloc(unsigned int dev_handle, unsigned int dev_ino);
void irq_free(unsigned int irq);
-#endif

void __init init_IRQ(void);
void fixup_irqs(void);
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -47,8 +47,6 @@
#include "cpumap.h"
#include "kstack.h"

-#define NUM_IVECS (IMAP_INR + 1)
-
struct ino_bucket *ivector_table;
unsigned long ivector_table_pa;

@@ -107,55 +105,196 @@ static void bucket_set_irq(unsigned long

#define irq_work_pa(__cpu) &(trap_block[(__cpu)].irq_worklist_pa)

-static struct {
- unsigned int dev_handle;
- unsigned int dev_ino;
- unsigned int in_use;
-} irq_table[NR_IRQS];
-static DEFINE_SPINLOCK(irq_alloc_lock);
+static unsigned long hvirq_major __initdata;
+static int __init early_hvirq_major(char *p)
+{
+ int rc = kstrtoul(p, 10, &hvirq_major);
+
+ return rc;
+}
+early_param("hvirq", early_hvirq_major);
+
+static int hv_irq_version;
+
+/* Major version 2.0 of HV_GRP_INTR added support for the VIRQ cookie
+ * based interfaces, but:
+ *
+ * 1) Several OSs, Solaris and Linux included, use them even when only
+ * negotiating version 1.0 (or failing to negotiate at all). So the
+ * hypervisor has a workaround that provides the VIRQ interfaces even
+ * when only verion 1.0 of the API is in use.
+ *
+ * 2) Second, and more importantly, with major version 2.0 these VIRQ
+ * interfaces only were actually hooked up for LDC interrupts, even
+ * though the Hypervisor specification clearly stated:
+ *
+ * The new interrupt API functions will be available to a guest
+ * when it negotiates version 2.0 in the interrupt API group 0x2. When
+ * a guest negotiates version 2.0, all interrupt sources will only
+ * support using the cookie interface, and any attempt to use the
+ * version 1.0 interrupt APIs numbered 0xa0 to 0xa6 will result in the
+ * ENOTSUPPORTED error being returned.
+ *
+ * with an emphasis on "all interrupt sources".
+ *
+ * To correct this, major version 3.0 was created which does actually
+ * support VIRQs for all interrupt sources (not just LDC devices). So
+ * if we want to move completely over the cookie based VIRQs we must
+ * negotiate major version 3.0 or later of HV_GRP_INTR.
+ */
+static bool sun4v_cookie_only_virqs(void)
+{
+ if (hv_irq_version >= 3)
+ return true;
+ return false;
+}

-unsigned char irq_alloc(unsigned int dev_handle, unsigned int dev_ino)
+static void __init irq_init_hv(void)
{
- unsigned long flags;
- unsigned char ent;
+ unsigned long hv_error, major, minor = 0;
+
+ if (tlb_type != hypervisor)
+ return;

- BUILD_BUG_ON(NR_IRQS >= 256);
+ if (hvirq_major)
+ major = hvirq_major;
+ else
+ major = 3;

- spin_lock_irqsave(&irq_alloc_lock, flags);
+ hv_error = sun4v_hvapi_register(HV_GRP_INTR, major, &minor);
+ if (!hv_error)
+ hv_irq_version = major;
+ else
+ hv_irq_version = 1;

- for (ent = 1; ent < NR_IRQS; ent++) {
- if (!irq_table[ent].in_use)
+ pr_info("SUN4V: Using IRQ API major %d, cookie only virqs %s\n",
+ hv_irq_version,
+ sun4v_cookie_only_virqs() ? "enabled" : "disabled");
+}
+
+/* This function is for the timer interrupt.*/
+int __init arch_probe_nr_irqs(void)
+{
+ return 1;
+}
+
+#define DEFAULT_NUM_IVECS (0xfffU)
+static unsigned int nr_ivec = DEFAULT_NUM_IVECS;
+#define NUM_IVECS (nr_ivec)
+
+static unsigned int __init size_nr_ivec(void)
+{
+ if (tlb_type == hypervisor) {
+ switch (sun4v_chip_type) {
+ /* Athena's devhandle|devino is large.*/
+ case SUN4V_CHIP_SPARC64X:
+ nr_ivec = 0xffff;
break;
+ }
}
- if (ent >= NR_IRQS) {
- printk(KERN_ERR "IRQ: Out of virtual IRQs.\n");
- ent = 0;
- } else {
- irq_table[ent].dev_handle = dev_handle;
- irq_table[ent].dev_ino = dev_ino;
- irq_table[ent].in_use = 1;
- }
+ return nr_ivec;
+}
+
+struct irq_handler_data {
+ union {
+ struct {
+ unsigned int dev_handle;
+ unsigned int dev_ino;
+ };
+ unsigned long sysino;
+ };
+ struct ino_bucket bucket;
+ unsigned long iclr;
+ unsigned long imap;
+};
+
+static inline unsigned int irq_data_to_handle(struct irq_data *data)
+{
+ struct irq_handler_data *ihd = data->handler_data;
+
+ return ihd->dev_handle;
+}
+
+static inline unsigned int irq_data_to_ino(struct irq_data *data)
+{
+ struct irq_handler_data *ihd = data->handler_data;
+
+ return ihd->dev_ino;
+}

- spin_unlock_irqrestore(&irq_alloc_lock, flags);
+static inline unsigned long irq_data_to_sysino(struct irq_data *data)
+{
+ struct irq_handler_data *ihd = data->handler_data;

- return ent;
+ return ihd->sysino;
}

-#ifdef CONFIG_PCI_MSI
void irq_free(unsigned int irq)
{
- unsigned long flags;
+ void *data = irq_get_handler_data(irq);

- if (irq >= NR_IRQS)
- return;
+ kfree(data);
+ irq_set_handler_data(irq, NULL);
+ irq_free_descs(irq, 1);
+}

- spin_lock_irqsave(&irq_alloc_lock, flags);
+unsigned int irq_alloc(unsigned int dev_handle, unsigned int dev_ino)
+{
+ int irq;

- irq_table[irq].in_use = 0;
+ irq = __irq_alloc_descs(-1, 1, 1, numa_node_id(), NULL);
+ if (irq <= 0)
+ goto out;

- spin_unlock_irqrestore(&irq_alloc_lock, flags);
+ return irq;
+out:
+ return 0;
+}
+
+static unsigned int cookie_exists(u32 devhandle, unsigned int devino)
+{
+ unsigned long hv_err, cookie;
+ struct ino_bucket *bucket;
+ unsigned int irq = 0U;
+
+ hv_err = sun4v_vintr_get_cookie(devhandle, devino, &cookie);
+ if (hv_err) {
+ pr_err("HV get cookie failed hv_err = %ld\n", hv_err);
+ goto out;
+ }
+
+ if (cookie & ((1UL << 63UL))) {
+ cookie = ~cookie;
+ bucket = (struct ino_bucket *) __va(cookie);
+ irq = bucket->__irq;
+ }
+out:
+ return irq;
+}
+
+static unsigned int sysino_exists(u32 devhandle, unsigned int devino)
+{
+ unsigned long sysino = sun4v_devino_to_sysino(devhandle, devino);
+ struct ino_bucket *bucket;
+ unsigned int irq;
+
+ bucket = &ivector_table[sysino];
+ irq = bucket_get_irq(__pa(bucket));
+
+ return irq;
+}
+
+void ack_bad_irq(unsigned int irq)
+{
+ pr_crit("BAD IRQ ack %d\n", irq);
+}
+
+void irq_install_pre_handler(int irq,
+ void (*func)(unsigned int, void *, void *),
+ void *arg1, void *arg2)
+{
+ pr_warn("IRQ pre handler NOT supported.\n");
}
-#endif

/*
* /proc/interrupts printing:
@@ -206,15 +345,6 @@ static unsigned int sun4u_compute_tid(un
return tid;
}

-struct irq_handler_data {
- unsigned long iclr;
- unsigned long imap;
-
- void (*pre_handler)(unsigned int, void *, void *);
- void *arg1;
- void *arg2;
-};
-
#ifdef CONFIG_SMP
static int irq_choose_cpu(unsigned int irq, const struct cpumask *affinity)
{
@@ -316,8 +446,8 @@ static void sun4u_irq_eoi(struct irq_dat

static void sun4v_irq_enable(struct irq_data *data)
{
- unsigned int ino = irq_table[data->irq].dev_ino;
unsigned long cpuid = irq_choose_cpu(data->irq, data->affinity);
+ unsigned int ino = irq_data_to_sysino(data);
int err;

err = sun4v_intr_settarget(ino, cpuid);
@@ -337,8 +467,8 @@ static void sun4v_irq_enable(struct irq_
static int sun4v_set_affinity(struct irq_data *data,
const struct cpumask *mask, bool force)
{
- unsigned int ino = irq_table[data->irq].dev_ino;
unsigned long cpuid = irq_choose_cpu(data->irq, mask);
+ unsigned int ino = irq_data_to_sysino(data);
int err;

err = sun4v_intr_settarget(ino, cpuid);
@@ -351,7 +481,7 @@ static int sun4v_set_affinity(struct irq

static void sun4v_irq_disable(struct irq_data *data)
{
- unsigned int ino = irq_table[data->irq].dev_ino;
+ unsigned int ino = irq_data_to_sysino(data);
int err;

err = sun4v_intr_setenabled(ino, HV_INTR_DISABLED);
@@ -362,7 +492,7 @@ static void sun4v_irq_disable(struct irq

static void sun4v_irq_eoi(struct irq_data *data)
{
- unsigned int ino = irq_table[data->irq].dev_ino;
+ unsigned int ino = irq_data_to_sysino(data);
int err;

err = sun4v_intr_setstate(ino, HV_INTR_STATE_IDLE);
@@ -373,14 +503,13 @@ static void sun4v_irq_eoi(struct irq_dat

static void sun4v_virq_enable(struct irq_data *data)
{
- unsigned long cpuid, dev_handle, dev_ino;
+ unsigned long dev_handle = irq_data_to_handle(data);
+ unsigned long dev_ino = irq_data_to_ino(data);
+ unsigned long cpuid;
int err;

cpuid = irq_choose_cpu(data->irq, data->affinity);

- dev_handle = irq_table[data->irq].dev_handle;
- dev_ino = irq_table[data->irq].dev_ino;
-
err = sun4v_vintr_set_target(dev_handle, dev_ino, cpuid);
if (err != HV_EOK)
printk(KERN_ERR "sun4v_vintr_set_target(%lx,%lx,%lu): "
@@ -403,14 +532,13 @@ static void sun4v_virq_enable(struct irq
static int sun4v_virt_set_affinity(struct irq_data *data,
const struct cpumask *mask, bool force)
{
- unsigned long cpuid, dev_handle, dev_ino;
+ unsigned long dev_handle = irq_data_to_handle(data);
+ unsigned long dev_ino = irq_data_to_ino(data);
+ unsigned long cpuid;
int err;

cpuid = irq_choose_cpu(data->irq, mask);

- dev_handle = irq_table[data->irq].dev_handle;
- dev_ino = irq_table[data->irq].dev_ino;
-
err = sun4v_vintr_set_target(dev_handle, dev_ino, cpuid);
if (err != HV_EOK)
printk(KERN_ERR "sun4v_vintr_set_target(%lx,%lx,%lu): "
@@ -422,11 +550,10 @@ static int sun4v_virt_set_affinity(struc

static void sun4v_virq_disable(struct irq_data *data)
{
- unsigned long dev_handle, dev_ino;
+ unsigned long dev_handle = irq_data_to_handle(data);
+ unsigned long dev_ino = irq_data_to_ino(data);
int err;

- dev_handle = irq_table[data->irq].dev_handle;
- dev_ino = irq_table[data->irq].dev_ino;

err = sun4v_vintr_set_valid(dev_handle, dev_ino,
HV_INTR_DISABLED);
@@ -438,12 +565,10 @@ static void sun4v_virq_disable(struct ir

static void sun4v_virq_eoi(struct irq_data *data)
{
- unsigned long dev_handle, dev_ino;
+ unsigned long dev_handle = irq_data_to_handle(data);
+ unsigned long dev_ino = irq_data_to_ino(data);
int err;

- dev_handle = irq_table[data->irq].dev_handle;
- dev_ino = irq_table[data->irq].dev_ino;
-
err = sun4v_vintr_set_state(dev_handle, dev_ino,
HV_INTR_STATE_IDLE);
if (err != HV_EOK)
@@ -479,31 +604,10 @@ static struct irq_chip sun4v_virq = {
.flags = IRQCHIP_EOI_IF_HANDLED,
};

-static void pre_flow_handler(struct irq_data *d)
-{
- struct irq_handler_data *handler_data = irq_data_get_irq_handler_data(d);
- unsigned int ino = irq_table[d->irq].dev_ino;
-
- handler_data->pre_handler(ino, handler_data->arg1, handler_data->arg2);
-}
-
-void irq_install_pre_handler(int irq,
- void (*func)(unsigned int, void *, void *),
- void *arg1, void *arg2)
-{
- struct irq_handler_data *handler_data = irq_get_handler_data(irq);
-
- handler_data->pre_handler = func;
- handler_data->arg1 = arg1;
- handler_data->arg2 = arg2;
-
- __irq_set_preflow_handler(irq, pre_flow_handler);
-}
-
unsigned int build_irq(int inofixup, unsigned long iclr, unsigned long imap)
{
- struct ino_bucket *bucket;
struct irq_handler_data *handler_data;
+ struct ino_bucket *bucket;
unsigned int irq;
int ino;

@@ -537,119 +641,166 @@ out:
return irq;
}

-static unsigned int sun4v_build_common(unsigned long sysino,
- struct irq_chip *chip)
+static unsigned int sun4v_build_common(u32 devhandle, unsigned int devino,
+ void (*handler_data_init)(struct irq_handler_data *data,
+ u32 devhandle, unsigned int devino),
+ struct irq_chip *chip)
{
- struct ino_bucket *bucket;
- struct irq_handler_data *handler_data;
+ struct irq_handler_data *data;
unsigned int irq;

- BUG_ON(tlb_type != hypervisor);
+ irq = irq_alloc(devhandle, devino);
+ if (!irq)
+ goto out;

- bucket = &ivector_table[sysino];
- irq = bucket_get_irq(__pa(bucket));
- if (!irq) {
- irq = irq_alloc(0, sysino);
- bucket_set_irq(__pa(bucket), irq);
- irq_set_chip_and_handler_name(irq, chip, handle_fasteoi_irq,
- "IVEC");
+ data = kzalloc(sizeof(struct irq_handler_data), GFP_ATOMIC);
+ if (unlikely(!data)) {
+ pr_err("IRQ handler data allocation failed.\n");
+ irq_free(irq);
+ irq = 0;
+ goto out;
}

- handler_data = irq_get_handler_data(irq);
- if (unlikely(handler_data))
- goto out;
+ irq_set_handler_data(irq, data);
+ handler_data_init(data, devhandle, devino);
+ irq_set_chip_and_handler_name(irq, chip, handle_fasteoi_irq, "IVEC");
+ data->imap = ~0UL;
+ data->iclr = ~0UL;
+out:
+ return irq;
+}

- handler_data = kzalloc(sizeof(struct irq_handler_data), GFP_ATOMIC);
- if (unlikely(!handler_data)) {
- prom_printf("IRQ: kzalloc(irq_handler_data) failed.\n");
- prom_halt();
- }
- irq_set_handler_data(irq, handler_data);
+static unsigned long cookie_assign(unsigned int irq, u32 devhandle,
+ unsigned int devino)
+{
+ struct irq_handler_data *ihd = irq_get_handler_data(irq);
+ unsigned long hv_error, cookie;

- /* Catch accidental accesses to these things. IMAP/ICLR handling
- * is done by hypervisor calls on sun4v platforms, not by direct
- * register accesses.
+ /* handler_irq needs to find the irq. cookie is seen signed in
+ * sun4v_dev_mondo and treated as a non ivector_table delivery.
*/
- handler_data->imap = ~0UL;
- handler_data->iclr = ~0UL;
+ ihd->bucket.__irq = irq;
+ cookie = ~__pa(&ihd->bucket);

-out:
- return irq;
+ hv_error = sun4v_vintr_set_cookie(devhandle, devino, cookie);
+ if (hv_error)
+ pr_err("HV vintr set cookie failed = %ld\n", hv_error);
+
+ return hv_error;
}

-unsigned int sun4v_build_irq(u32 devhandle, unsigned int devino)
+static void cookie_handler_data(struct irq_handler_data *data,
+ u32 devhandle, unsigned int devino)
{
- unsigned long sysino = sun4v_devino_to_sysino(devhandle, devino);
+ data->dev_handle = devhandle;
+ data->dev_ino = devino;
+}

- return sun4v_build_common(sysino, &sun4v_irq);
+static unsigned int cookie_build_irq(u32 devhandle, unsigned int devino,
+ struct irq_chip *chip)
+{
+ unsigned long hv_error;
+ unsigned int irq;
+
+ irq = sun4v_build_common(devhandle, devino, cookie_handler_data, chip);
+
+ hv_error = cookie_assign(irq, devhandle, devino);
+ if (hv_error) {
+ irq_free(irq);
+ irq = 0;
+ }
+
+ return irq;
}

-unsigned int sun4v_build_virq(u32 devhandle, unsigned int devino)
+static unsigned int sun4v_build_cookie(u32 devhandle, unsigned int devino)
{
- struct irq_handler_data *handler_data;
- unsigned long hv_err, cookie;
- struct ino_bucket *bucket;
unsigned int irq;

- bucket = kzalloc(sizeof(struct ino_bucket), GFP_ATOMIC);
- if (unlikely(!bucket))
- return 0;
-
- /* The only reference we store to the IRQ bucket is
- * by physical address which kmemleak can't see, tell
- * it that this object explicitly is not a leak and
- * should be scanned.
- */
- kmemleak_not_leak(bucket);
+ irq = cookie_exists(devhandle, devino);
+ if (irq)
+ goto out;

- __flush_dcache_range((unsigned long) bucket,
- ((unsigned long) bucket +
- sizeof(struct ino_bucket)));
+ irq = cookie_build_irq(devhandle, devino, &sun4v_virq);

- irq = irq_alloc(devhandle, devino);
+out:
+ return irq;
+}
+
+static void sysino_set_bucket(unsigned int irq)
+{
+ struct irq_handler_data *ihd = irq_get_handler_data(irq);
+ struct ino_bucket *bucket;
+ unsigned long sysino;
+
+ sysino = sun4v_devino_to_sysino(ihd->dev_handle, ihd->dev_ino);
+ BUG_ON(sysino >= nr_ivec);
+ bucket = &ivector_table[sysino];
bucket_set_irq(__pa(bucket), irq);
+}

- irq_set_chip_and_handler_name(irq, &sun4v_virq, handle_fasteoi_irq,
- "IVEC");
+static void sysino_handler_data(struct irq_handler_data *data,
+ u32 devhandle, unsigned int devino)
+{
+ unsigned long sysino;

- handler_data = kzalloc(sizeof(struct irq_handler_data), GFP_ATOMIC);
- if (unlikely(!handler_data))
- return 0;
+ sysino = sun4v_devino_to_sysino(devhandle, devino);
+ data->sysino = sysino;
+}

- /* In order to make the LDC channel startup sequence easier,
- * especially wrt. locking, we do not let request_irq() enable
- * the interrupt.
- */
- irq_set_status_flags(irq, IRQ_NOAUTOEN);
- irq_set_handler_data(irq, handler_data);
+static unsigned int sysino_build_irq(u32 devhandle, unsigned int devino,
+ struct irq_chip *chip)
+{
+ unsigned int irq;

- /* Catch accidental accesses to these things. IMAP/ICLR handling
- * is done by hypervisor calls on sun4v platforms, not by direct
- * register accesses.
- */
- handler_data->imap = ~0UL;
- handler_data->iclr = ~0UL;
+ irq = sun4v_build_common(devhandle, devino, sysino_handler_data, chip);
+ if (!irq)
+ goto out;

- cookie = ~__pa(bucket);
- hv_err = sun4v_vintr_set_cookie(devhandle, devino, cookie);
- if (hv_err) {
- prom_printf("IRQ: Fatal, cannot set cookie for [%x:%x] "
- "err=%lu\n", devhandle, devino, hv_err);
- prom_halt();
- }
+ sysino_set_bucket(irq);
+out:
+ return irq;
+}

+static int sun4v_build_sysino(u32 devhandle, unsigned int devino)
+{
+ int irq;
+
+ irq = sysino_exists(devhandle, devino);
+ if (irq)
+ goto out;
+
+ irq = sysino_build_irq(devhandle, devino, &sun4v_irq);
+out:
return irq;
}

-void ack_bad_irq(unsigned int irq)
+unsigned int sun4v_build_irq(u32 devhandle, unsigned int devino)
{
- unsigned int ino = irq_table[irq].dev_ino;
+ unsigned int irq;

- if (!ino)
- ino = 0xdeadbeef;
+ if (sun4v_cookie_only_virqs())
+ irq = sun4v_build_cookie(devhandle, devino);
+ else
+ irq = sun4v_build_sysino(devhandle, devino);

- printk(KERN_CRIT "Unexpected IRQ from ino[%x] irq[%u]\n",
- ino, irq);
+ return irq;
+}
+
+unsigned int sun4v_build_virq(u32 devhandle, unsigned int devino)
+{
+ int irq;
+
+ irq = cookie_build_irq(devhandle, devino, &sun4v_virq);
+ if (!irq)
+ goto out;
+
+ /* This is borrowed from the original function.
+ */
+ irq_set_status_flags(irq, IRQ_NOAUTOEN);
+
+out:
+ return irq;
}

void *hardirq_stack[NR_CPUS];
@@ -720,9 +871,12 @@ void fixup_irqs(void)

for (irq = 0; irq < NR_IRQS; irq++) {
struct irq_desc *desc = irq_to_desc(irq);
- struct irq_data *data = irq_desc_get_irq_data(desc);
+ struct irq_data *data;
unsigned long flags;

+ if (!desc)
+ continue;
+ data = irq_desc_get_irq_data(desc);
raw_spin_lock_irqsave(&desc->lock, flags);
if (desc->action && !irqd_is_per_cpu(data)) {
if (data->chip->irq_set_affinity)
@@ -922,16 +1076,22 @@ static struct irqaction timer_irq_action
.name = "timer",
};

-/* Only invoked on boot processor. */
-void __init init_IRQ(void)
+static void __init irq_ivector_init(void)
{
- unsigned long size;
+ unsigned long size, order;
+ unsigned int ivecs;

- map_prom_timers();
- kill_prom_timer();
+ /* If we are doing cookie only VIRQs then we do not need the ivector
+ * table to process interrupts.
+ */
+ if (sun4v_cookie_only_virqs())
+ return;

- size = sizeof(struct ino_bucket) * NUM_IVECS;
- ivector_table = kzalloc(size, GFP_KERNEL);
+ ivecs = size_nr_ivec();
+ size = sizeof(struct ino_bucket) * ivecs;
+ order = get_order(size);
+ ivector_table = (struct ino_bucket *)
+ __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
if (!ivector_table) {
prom_printf("Fatal error, cannot allocate ivector_table\n");
prom_halt();
@@ -940,6 +1100,15 @@ void __init init_IRQ(void)
((unsigned long) ivector_table) + size);

ivector_table_pa = __pa(ivector_table);
+}
+
+/* Only invoked on boot processor.*/
+void __init init_IRQ(void)
+{
+ irq_init_hv();
+ irq_ivector_init();
+ map_prom_timers();
+ kill_prom_timer();

if (tlb_type == hypervisor)
sun4v_init_mondo_queues();

2014-10-28 03:41:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 144/146] sparc64: Increase size of boot string to 1024 bytes

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Kleikamp <[email protected]>

[ Upstream commit 1cef94c36bd4d79b5ae3a3df99ee0d76d6a4a6dc ]

This is the longest boot string that silo supports.

Signed-off-by: Dave Kleikamp <[email protected]>
Cc: Bob Picco <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/prom/bootstr_64.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/sparc/prom/bootstr_64.c
+++ b/arch/sparc/prom/bootstr_64.c
@@ -14,7 +14,10 @@
* the .bss section or it will break things.
*/

-#define BARG_LEN 256
+/* We limit BARG_LEN to 1024 because this is the size of the
+ * 'barg_out' command line buffer in the SILO bootloader.
+ */
+#define BARG_LEN 1024
struct {
int bootstr_len;
int bootstr_valid;

2014-10-28 03:41:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 128/146] sparc64: Fix FPU register corruption with AES crypto offload.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit f4da3628dc7c32a59d1fb7116bb042e6f436d611 ]

The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
key material is preloaded into the FPU registers, and then we loop
over and over doing the crypt operation, reusing those pre-cooked key
registers.

There are intervening blkcipher*() calls between the crypt operation
calls. And those might perform memcpy() and thus also try to use the
FPU.

The sparc64 kernel FPU usage mechanism is designed to allow such
recursive uses, but with a catch.

There has to be a trap between the two FPU using threads of control.

The mechanism works by, when the FPU is already in use by the kernel,
allocating a slot for FPU saving at trap time. Then if, within the
trap handler, we try to use the FPU registers, the pre-trap FPU
register state is saved into the slot. Then at trap return time we
notice this and restore the pre-trap FPU state.

Over the long term there are various more involved ways we can make
this work, but for a quick fix let's take advantage of the fact that
the situation where this happens is very limited.

All sparc64 chips that support the crypto instructiosn also are using
the Niagara4 memcpy routine, and that routine only uses the FPU for
large copies where we can't get the source aligned properly to a
multiple of 8 bytes.

We look to see if the FPU is already in use in this context, and if so
we use the non-large copy path which only uses integer registers.

Furthermore, we also limit this special logic to when we are doing
kernel copy, rather than a user copy.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/visasm.h | 8 ++++++++
arch/sparc/lib/NG4memcpy.S | 14 +++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)

--- a/arch/sparc/include/asm/visasm.h
+++ b/arch/sparc/include/asm/visasm.h
@@ -39,6 +39,14 @@
297: wr %o5, FPRS_FEF, %fprs; \
298:

+#define VISEntryHalfFast(fail_label) \
+ rd %fprs, %o5; \
+ andcc %o5, FPRS_FEF, %g0; \
+ be,pt %icc, 297f; \
+ nop; \
+ ba,a,pt %xcc, fail_label; \
+297: wr %o5, FPRS_FEF, %fprs;
+
#define VISExitHalf \
wr %o5, 0, %fprs;

--- a/arch/sparc/lib/NG4memcpy.S
+++ b/arch/sparc/lib/NG4memcpy.S
@@ -41,6 +41,10 @@
#endif
#endif

+#if !defined(EX_LD) && !defined(EX_ST)
+#define NON_USER_COPY
+#endif
+
#ifndef EX_LD
#define EX_LD(x) x
#endif
@@ -197,9 +201,13 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len
mov EX_RETVAL(%o3), %o0

.Llarge_src_unaligned:
+#ifdef NON_USER_COPY
+ VISEntryHalfFast(.Lmedium_vis_entry_fail)
+#else
+ VISEntryHalf
+#endif
andn %o2, 0x3f, %o4
sub %o2, %o4, %o2
- VISEntryHalf
alignaddr %o1, %g0, %g1
add %o1, %o4, %o1
EX_LD(LOAD(ldd, %g1 + 0x00, %f0))
@@ -240,6 +248,10 @@ FUNC_NAME: /* %o0=dst, %o1=src, %o2=len
nop
ba,a,pt %icc, .Lmedium_unaligned

+#ifdef NON_USER_COPY
+.Lmedium_vis_entry_fail:
+ or %o0, %o1, %g2
+#endif
.Lmedium:
LOAD(prefetch, %o1 + 0x40, #n_reads_strong)
andcc %g2, 0x7, %g0

2014-10-28 03:41:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 146/146] sparc64: Implement __get_user_pages_fast().

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 06090e8ed89ea2113a236befb41f71d51f100e60 ]

It is not sufficient to only implement get_user_pages_fast(), you
must also implement the atomic version __get_user_pages_fast()
otherwise you end up using the weak symbol fallback implementation
which simply returns zero.

This is dangerous, because it causes the futex code to loop forever
if transparent hugepages are supported (see get_futex_key()).

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/mm/gup.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

--- a/arch/sparc/mm/gup.c
+++ b/arch/sparc/mm/gup.c
@@ -160,6 +160,36 @@ static int gup_pud_range(pgd_t pgd, unsi
return 1;
}

+int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
+ struct page **pages)
+{
+ struct mm_struct *mm = current->mm;
+ unsigned long addr, len, end;
+ unsigned long next, flags;
+ pgd_t *pgdp;
+ int nr = 0;
+
+ start &= PAGE_MASK;
+ addr = start;
+ len = (unsigned long) nr_pages << PAGE_SHIFT;
+ end = start + len;
+
+ local_irq_save(flags);
+ pgdp = pgd_offset(mm, addr);
+ do {
+ pgd_t pgd = *pgdp;
+
+ next = pgd_addr_end(addr, end);
+ if (pgd_none(pgd))
+ break;
+ if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
+ break;
+ } while (pgdp++, addr = next, addr != end);
+ local_irq_restore(flags);
+
+ return nr;
+}
+
int get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
{

2014-10-28 03:42:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 132/146] sparc64: support M6 and M7 for building CPU distribution map

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Allen Pais <[email protected]>

Add M6 and M7 chip type in cpumap.c to correctly build CPU distribution map that spans all online CPUs.

Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/cpumap.c | 2 ++
1 file changed, 2 insertions(+)

--- a/arch/sparc/kernel/cpumap.c
+++ b/arch/sparc/kernel/cpumap.c
@@ -326,6 +326,8 @@ static int iterate_cpu(struct cpuinfo_tr
case SUN4V_CHIP_NIAGARA3:
case SUN4V_CHIP_NIAGARA4:
case SUN4V_CHIP_NIAGARA5:
+ case SUN4V_CHIP_SPARC_M6:
+ case SUN4V_CHIP_SPARC_M7:
case SUN4V_CHIP_SPARC64X:
rover_inc_table = niagara_iterate_method;
break;

2014-10-28 03:42:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 134/146] sparc64: T5 PMU

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: bob picco <[email protected]>

The T5 (niagara5) has different PCR related HV fast trap values and a new
HV API Group. This patch utilizes these and shares when possible with niagara4.

We use the same sparc_pmu niagara4_pmu. Should there be new effort to
obtain the MCU perf statistics then this would have to be changed.

Cc: [email protected]
Signed-off-by: Bob Picco <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/hypervisor.h | 11 ++++++++
arch/sparc/kernel/hvapi.c | 1
arch/sparc/kernel/hvcalls.S | 16 ++++++++++++
arch/sparc/kernel/pcr.c | 47 ++++++++++++++++++++++++++++++++----
arch/sparc/kernel/perf_event.c | 3 +-
5 files changed, 73 insertions(+), 5 deletions(-)

--- a/arch/sparc/include/asm/hypervisor.h
+++ b/arch/sparc/include/asm/hypervisor.h
@@ -2947,6 +2947,16 @@ unsigned long sun4v_vt_set_perfreg(unsig
unsigned long reg_val);
#endif

+#define HV_FAST_T5_GET_PERFREG 0x1a8
+#define HV_FAST_T5_SET_PERFREG 0x1a9
+
+#ifndef __ASSEMBLY__
+unsigned long sun4v_t5_get_perfreg(unsigned long reg_num,
+ unsigned long *reg_val);
+unsigned long sun4v_t5_set_perfreg(unsigned long reg_num,
+ unsigned long reg_val);
+#endif
+
/* Function numbers for HV_CORE_TRAP. */
#define HV_CORE_SET_VER 0x00
#define HV_CORE_PUTCHAR 0x01
@@ -2978,6 +2988,7 @@ unsigned long sun4v_vt_set_perfreg(unsig
#define HV_GRP_VF_CPU 0x0205
#define HV_GRP_KT_CPU 0x0209
#define HV_GRP_VT_CPU 0x020c
+#define HV_GRP_T5_CPU 0x0211
#define HV_GRP_DIAG 0x0300

#ifndef __ASSEMBLY__
--- a/arch/sparc/kernel/hvapi.c
+++ b/arch/sparc/kernel/hvapi.c
@@ -46,6 +46,7 @@ static struct api_info api_table[] = {
{ .group = HV_GRP_VF_CPU, },
{ .group = HV_GRP_KT_CPU, },
{ .group = HV_GRP_VT_CPU, },
+ { .group = HV_GRP_T5_CPU, },
{ .group = HV_GRP_DIAG, .flags = FLAG_PRE_API },
};

--- a/arch/sparc/kernel/hvcalls.S
+++ b/arch/sparc/kernel/hvcalls.S
@@ -821,3 +821,19 @@ ENTRY(sun4v_vt_set_perfreg)
retl
nop
ENDPROC(sun4v_vt_set_perfreg)
+
+ENTRY(sun4v_t5_get_perfreg)
+ mov %o1, %o4
+ mov HV_FAST_T5_GET_PERFREG, %o5
+ ta HV_FAST_TRAP
+ stx %o1, [%o4]
+ retl
+ nop
+ENDPROC(sun4v_t5_get_perfreg)
+
+ENTRY(sun4v_t5_set_perfreg)
+ mov HV_FAST_T5_SET_PERFREG, %o5
+ ta HV_FAST_TRAP
+ retl
+ nop
+ENDPROC(sun4v_t5_set_perfreg)
--- a/arch/sparc/kernel/pcr.c
+++ b/arch/sparc/kernel/pcr.c
@@ -191,12 +191,41 @@ static const struct pcr_ops n4_pcr_ops =
.pcr_nmi_disable = PCR_N4_PICNPT,
};

+static u64 n5_pcr_read(unsigned long reg_num)
+{
+ unsigned long val;
+
+ (void) sun4v_t5_get_perfreg(reg_num, &val);
+
+ return val;
+}
+
+static void n5_pcr_write(unsigned long reg_num, u64 val)
+{
+ (void) sun4v_t5_set_perfreg(reg_num, val);
+}
+
+static const struct pcr_ops n5_pcr_ops = {
+ .read_pcr = n5_pcr_read,
+ .write_pcr = n5_pcr_write,
+ .read_pic = n4_pic_read,
+ .write_pic = n4_pic_write,
+ .nmi_picl_value = n4_picl_value,
+ .pcr_nmi_enable = (PCR_N4_PICNPT | PCR_N4_STRACE |
+ PCR_N4_UTRACE | PCR_N4_TOE |
+ (26 << PCR_N4_SL_SHIFT)),
+ .pcr_nmi_disable = PCR_N4_PICNPT,
+};
+
+
static unsigned long perf_hsvc_group;
static unsigned long perf_hsvc_major;
static unsigned long perf_hsvc_minor;

static int __init register_perf_hsvc(void)
{
+ unsigned long hverror;
+
if (tlb_type == hypervisor) {
switch (sun4v_chip_type) {
case SUN4V_CHIP_NIAGARA1:
@@ -215,6 +244,10 @@ static int __init register_perf_hsvc(voi
perf_hsvc_group = HV_GRP_VT_CPU;
break;

+ case SUN4V_CHIP_NIAGARA5:
+ perf_hsvc_group = HV_GRP_T5_CPU;
+ break;
+
default:
return -ENODEV;
}
@@ -222,10 +255,12 @@ static int __init register_perf_hsvc(voi

perf_hsvc_major = 1;
perf_hsvc_minor = 0;
- if (sun4v_hvapi_register(perf_hsvc_group,
- perf_hsvc_major,
- &perf_hsvc_minor)) {
- printk("perfmon: Could not register hvapi.\n");
+ hverror = sun4v_hvapi_register(perf_hsvc_group,
+ perf_hsvc_major,
+ &perf_hsvc_minor);
+ if (hverror) {
+ pr_err("perfmon: Could not register hvapi(0x%lx).\n",
+ hverror);
return -ENODEV;
}
}
@@ -254,6 +289,10 @@ static int __init setup_sun4v_pcr_ops(vo
pcr_ops = &n4_pcr_ops;
break;

+ case SUN4V_CHIP_NIAGARA5:
+ pcr_ops = &n5_pcr_ops;
+ break;
+
default:
ret = -ENODEV;
break;
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1662,7 +1662,8 @@ static bool __init supported_pmu(void)
sparc_pmu = &niagara2_pmu;
return true;
}
- if (!strcmp(sparc_pmu_type, "niagara4")) {
+ if (!strcmp(sparc_pmu_type, "niagara4") ||
+ !strcmp(sparc_pmu_type, "niagara5")) {
sparc_pmu = &niagara4_pmu;
return true;
}

2014-10-28 07:29:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 126/146] sparc64: Fix reversed start/end in flush_tlb_kernel_range()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 473ad7f4fb005d1bb727e4ef27d370d28703a062 ]

When we have to split up a flush request into multiple pieces
(in order to avoid the firmware range) we don't specify the
arguments in the right order for the second piece.

Fix the order, or else we get hangs as the code tries to
flush "a lot" of entries and we get lockups like this:

[ 4422.981276] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [expect:117032]
[ 4422.996130] Modules linked in: ipv6 loop usb_storage igb ptp sg sr_mod ehci_pci ehci_hcd pps_core n2_rng rng_core
[ 4423.016617] CPU: 12 PID: 117032 Comm: expect Not tainted 3.17.0-rc4+ #1608
[ 4423.030331] task: fff8003cc730e220 ti: fff8003d99d54000 task.ti: fff8003d99d54000
[ 4423.045282] TSTATE: 0000000011001602 TPC: 00000000004521e8 TNPC: 00000000004521ec Y: 00000000 Not tainted
[ 4423.064905] TPC: <__flush_tlb_kernel_range+0x28/0x40>
[ 4423.074964] g0: 000000000052fd10 g1: 00000001295a8000 g2: ffffff7176ffc000 g3: 0000000000002000
[ 4423.092324] g4: fff8003cc730e220 g5: fff8003dfedcc000 g6: fff8003d99d54000 g7: 0000000000000006
[ 4423.109687] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000000000003 o3: 00000000f0000000
[ 4423.127058] o4: 0000000000000080 o5: 00000001295a8000 sp: fff8003d99d56d01 ret_pc: 000000000052ff54
[ 4423.145121] RPC: <__purge_vmap_area_lazy+0x314/0x3a0>
[ 4423.155185] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000a38040 l3: 0000000000000000
[ 4423.172559] l4: fff8003dae8965e0 l5: ffffffffffffffff l6: 0000000000000000 l7: 00000000f7e2b138
[ 4423.189913] i0: fff8003d99d576a0 i1: fff8003d99d576a8 i2: fff8003d99d575e8 i3: 0000000000000000
[ 4423.207284] i4: 0000000000008008 i5: fff8003d99d575c8 i6: fff8003d99d56df1 i7: 0000000000530c24
[ 4423.224640] I7: <free_vmap_area_noflush+0x64/0x80>
[ 4423.234193] Call Trace:
[ 4423.239051] [0000000000530c24] free_vmap_area_noflush+0x64/0x80
[ 4423.251029] [0000000000531a7c] remove_vm_area+0x5c/0x80
[ 4423.261628] [0000000000531b80] __vunmap+0x20/0x120
[ 4423.271352] [000000000071cf18] n_tty_close+0x18/0x40
[ 4423.281423] [00000000007222b0] tty_ldisc_close+0x30/0x60
[ 4423.292183] [00000000007225a4] tty_ldisc_reinit+0x24/0xa0
[ 4423.303120] [0000000000722ab4] tty_ldisc_hangup+0xd4/0x1e0
[ 4423.314232] [0000000000719aa0] __tty_hangup+0x280/0x3c0
[ 4423.324835] [0000000000724cb4] pty_close+0x134/0x1a0
[ 4423.334905] [000000000071aa24] tty_release+0x104/0x500
[ 4423.345316] [00000000005511d0] __fput+0x90/0x1e0
[ 4423.354701] [000000000047fa54] task_work_run+0x94/0xe0
[ 4423.365126] [0000000000404b44] __handle_signal+0xc/0x2c

Fixes: 4ca9a23765da ("sparc64: Guard against flushing openfirmware mappings.")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/mm/init_64.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2790,8 +2790,8 @@ void flush_tlb_kernel_range(unsigned lon
do_flush_tlb_kernel_range(start, LOW_OBP_ADDRESS);
}
if (end > HI_OBP_ADDRESS) {
- flush_tsb_kernel_range(end, HI_OBP_ADDRESS);
- do_flush_tlb_kernel_range(end, HI_OBP_ADDRESS);
+ flush_tsb_kernel_range(HI_OBP_ADDRESS, end);
+ do_flush_tlb_kernel_range(HI_OBP_ADDRESS, end);
}
} else {
flush_tsb_kernel_range(start, end);

2014-10-28 07:30:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 135/146] sparc64: Switch to 4-level page tables.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit ac55c768143aa34cc3789c4820cbb0809a76fd9c ]

This has become necessary with chips that support more than 43-bits
of physical addressing.

Based almost entirely upon a patch by Bob Picco.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/page_64.h | 6 +++++
arch/sparc/include/asm/pgalloc_64.h | 28 ++++++++++++++++++++++++++-
arch/sparc/include/asm/pgtable_64.h | 37 +++++++++++++++++++++++++++++++-----
arch/sparc/include/asm/tsb.h | 10 +++++++++
arch/sparc/kernel/smp_64.c | 7 ++++++
arch/sparc/mm/init_64.c | 31 ++++++++++++++++++++++++++----
6 files changed, 109 insertions(+), 10 deletions(-)

--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -57,18 +57,21 @@ void copy_user_page(void *to, void *from
typedef struct { unsigned long pte; } pte_t;
typedef struct { unsigned long iopte; } iopte_t;
typedef struct { unsigned long pmd; } pmd_t;
+typedef struct { unsigned long pud; } pud_t;
typedef struct { unsigned long pgd; } pgd_t;
typedef struct { unsigned long pgprot; } pgprot_t;

#define pte_val(x) ((x).pte)
#define iopte_val(x) ((x).iopte)
#define pmd_val(x) ((x).pmd)
+#define pud_val(x) ((x).pud)
#define pgd_val(x) ((x).pgd)
#define pgprot_val(x) ((x).pgprot)

#define __pte(x) ((pte_t) { (x) } )
#define __iopte(x) ((iopte_t) { (x) } )
#define __pmd(x) ((pmd_t) { (x) } )
+#define __pud(x) ((pud_t) { (x) } )
#define __pgd(x) ((pgd_t) { (x) } )
#define __pgprot(x) ((pgprot_t) { (x) } )

@@ -77,18 +80,21 @@ typedef struct { unsigned long pgprot; }
typedef unsigned long pte_t;
typedef unsigned long iopte_t;
typedef unsigned long pmd_t;
+typedef unsigned long pud_t;
typedef unsigned long pgd_t;
typedef unsigned long pgprot_t;

#define pte_val(x) (x)
#define iopte_val(x) (x)
#define pmd_val(x) (x)
+#define pud_val(x) (x)
#define pgd_val(x) (x)
#define pgprot_val(x) (x)

#define __pte(x) (x)
#define __iopte(x) (x)
#define __pmd(x) (x)
+#define __pud(x) (x)
#define __pgd(x) (x)
#define __pgprot(x) (x)

--- a/arch/sparc/include/asm/pgalloc_64.h
+++ b/arch/sparc/include/asm/pgalloc_64.h
@@ -15,6 +15,13 @@

extern struct kmem_cache *pgtable_cache;

+static inline void __pgd_populate(pgd_t *pgd, pud_t *pud)
+{
+ pgd_set(pgd, pud);
+}
+
+#define pgd_populate(MM, PGD, PUD) __pgd_populate(PGD, PUD)
+
static inline pgd_t *pgd_alloc(struct mm_struct *mm)
{
return kmem_cache_alloc(pgtable_cache, GFP_KERNEL);
@@ -25,7 +32,23 @@ static inline void pgd_free(struct mm_st
kmem_cache_free(pgtable_cache, pgd);
}

-#define pud_populate(MM, PUD, PMD) pud_set(PUD, PMD)
+static inline void __pud_populate(pud_t *pud, pmd_t *pmd)
+{
+ pud_set(pud, pmd);
+}
+
+#define pud_populate(MM, PUD, PMD) __pud_populate(PUD, PMD)
+
+static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+ return kmem_cache_alloc(pgtable_cache,
+ GFP_KERNEL|__GFP_REPEAT);
+}
+
+static inline void pud_free(struct mm_struct *mm, pud_t *pud)
+{
+ kmem_cache_free(pgtable_cache, pud);
+}

static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
{
@@ -91,4 +114,7 @@ static inline void __pte_free_tlb(struct
#define __pmd_free_tlb(tlb, pmd, addr) \
pgtable_free_tlb(tlb, pmd, false)

+#define __pud_free_tlb(tlb, pud, addr) \
+ pgtable_free_tlb(tlb, pud, false)
+
#endif /* _SPARC64_PGALLOC_H */
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -20,8 +20,6 @@
#include <asm/page.h>
#include <asm/processor.h>

-#include <asm-generic/pgtable-nopud.h>
-
/* The kernel image occupies 0x4000000 to 0x6000000 (4MB --> 96MB).
* The page copy blockops can use 0x6000000 to 0x8000000.
* The 8K TSB is mapped in the 0x8000000 to 0x8400000 range.
@@ -55,13 +53,21 @@
#define PMD_MASK (~(PMD_SIZE-1))
#define PMD_BITS (PAGE_SHIFT - 3)

-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define PGDIR_SHIFT (PAGE_SHIFT + (PAGE_SHIFT-3) + PMD_BITS)
+/* PUD_SHIFT determines the size of the area a third-level page
+ * table can map
+ */
+#define PUD_SHIFT (PMD_SHIFT + PMD_BITS)
+#define PUD_SIZE (_AC(1,UL) << PUD_SHIFT)
+#define PUD_MASK (~(PUD_SIZE-1))
+#define PUD_BITS (PAGE_SHIFT - 3)
+
+/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
+#define PGDIR_SHIFT (PUD_SHIFT + PUD_BITS)
#define PGDIR_SIZE (_AC(1,UL) << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE-1))
#define PGDIR_BITS (PAGE_SHIFT - 3)

-#if (PGDIR_SHIFT + PGDIR_BITS) != 43
+#if (PGDIR_SHIFT + PGDIR_BITS) != 53
#error Page table parameters do not cover virtual address space properly.
#endif

@@ -93,6 +99,7 @@ static inline bool kern_addr_valid(unsig
/* Entries per page directory level. */
#define PTRS_PER_PTE (1UL << (PAGE_SHIFT-3))
#define PTRS_PER_PMD (1UL << PMD_BITS)
+#define PTRS_PER_PUD (1UL << PUD_BITS)
#define PTRS_PER_PGD (1UL << PGDIR_BITS)

/* Kernel has a separate 44bit address space. */
@@ -101,6 +108,9 @@ static inline bool kern_addr_valid(unsig
#define pmd_ERROR(e) \
pr_err("%s:%d: bad pmd %p(%016lx) seen at (%pS)\n", \
__FILE__, __LINE__, &(e), pmd_val(e), __builtin_return_address(0))
+#define pud_ERROR(e) \
+ pr_err("%s:%d: bad pud %p(%016lx) seen at (%pS)\n", \
+ __FILE__, __LINE__, &(e), pud_val(e), __builtin_return_address(0))
#define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %p(%016lx) seen at (%pS)\n", \
__FILE__, __LINE__, &(e), pgd_val(e), __builtin_return_address(0))
@@ -779,6 +789,11 @@ static inline int pmd_present(pmd_t pmd)
#define pud_bad(pud) ((pud_val(pud) & ~PAGE_MASK) || \
!__kern_addr_valid(pud_val(pud)))

+#define pgd_none(pgd) (!pgd_val(pgd))
+
+#define pgd_bad(pgd) ((pgd_val(pgd) & ~PAGE_MASK) || \
+ !__kern_addr_valid(pgd_val(pgd)))
+
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void set_pmd_at(struct mm_struct *mm, unsigned long addr,
pmd_t *pmdp, pmd_t pmd);
@@ -815,10 +830,17 @@ static inline unsigned long __pmd_page(p
#define pmd_clear(pmdp) (pmd_val(*(pmdp)) = 0UL)
#define pud_present(pud) (pud_val(pud) != 0U)
#define pud_clear(pudp) (pud_val(*(pudp)) = 0UL)
+#define pgd_page_vaddr(pgd) \
+ ((unsigned long) __va(pgd_val(pgd)))
+#define pgd_present(pgd) (pgd_val(pgd) != 0U)
+#define pgd_clear(pgdp) (pgd_val(*(pgd)) = 0UL)

/* Same in both SUN4V and SUN4U. */
#define pte_none(pte) (!pte_val(pte))

+#define pgd_set(pgdp, pudp) \
+ (pgd_val(*(pgdp)) = (__pa((unsigned long) (pudp))))
+
/* to find an entry in a page-table-directory. */
#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
#define pgd_offset(mm, address) ((mm)->pgd + pgd_index(address))
@@ -826,6 +848,11 @@ static inline unsigned long __pmd_page(p
/* to find an entry in a kernel page-table-directory */
#define pgd_offset_k(address) pgd_offset(&init_mm, address)

+/* Find an entry in the third-level page table.. */
+#define pud_index(address) (((address) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
+#define pud_offset(pgdp, address) \
+ ((pud_t *) pgd_page_vaddr(*(pgdp)) + pud_index(address))
+
/* Find an entry in the second-level page table.. */
#define pmd_offset(pudp, address) \
((pmd_t *) pud_page_vaddr(*(pudp)) + \
--- a/arch/sparc/include/asm/tsb.h
+++ b/arch/sparc/include/asm/tsb.h
@@ -145,6 +145,11 @@ extern struct tsb_phys_patch_entry __tsb
andn REG2, 0x7, REG2; \
ldx [REG1 + REG2], REG1; \
brz,pn REG1, FAIL_LABEL; \
+ sllx VADDR, 64 - (PUD_SHIFT + PUD_BITS), REG2; \
+ srlx REG2, 64 - PAGE_SHIFT, REG2; \
+ andn REG2, 0x7, REG2; \
+ ldxa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \
+ brz,pn REG1, FAIL_LABEL; \
sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \
srlx REG2, 64 - PAGE_SHIFT, REG2; \
andn REG2, 0x7, REG2; \
@@ -198,6 +203,11 @@ extern struct tsb_phys_patch_entry __tsb
andn REG2, 0x7, REG2; \
ldxa [PHYS_PGD + REG2] ASI_PHYS_USE_EC, REG1; \
brz,pn REG1, FAIL_LABEL; \
+ sllx VADDR, 64 - (PUD_SHIFT + PUD_BITS), REG2; \
+ srlx REG2, 64 - PAGE_SHIFT, REG2; \
+ andn REG2, 0x7, REG2; \
+ ldxa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \
+ brz,pn REG1, FAIL_LABEL; \
sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \
srlx REG2, 64 - PAGE_SHIFT, REG2; \
andn REG2, 0x7, REG2; \
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1467,6 +1467,13 @@ static void __init pcpu_populate_pte(uns
pud_t *pud;
pmd_t *pmd;

+ if (pgd_none(*pgd)) {
+ pud_t *new;
+
+ new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+ pgd_populate(&init_mm, pgd, new);
+ }
+
pud = pud_offset(pgd, addr);
if (pud_none(*pud)) {
pmd_t *new;
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1390,6 +1390,13 @@ static unsigned long __ref kernel_map_ra
pmd_t *pmd;
pte_t *pte;

+ if (pgd_none(*pgd)) {
+ pud_t *new;
+
+ new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
+ alloc_bytes += PAGE_SIZE;
+ pgd_populate(&init_mm, pgd, new);
+ }
pud = pud_offset(pgd, vstart);
if (pud_none(*pud)) {
pmd_t *new;
@@ -1856,7 +1863,12 @@ static void __init sun4v_linear_pte_xor_
/* paging_init() sets up the page tables */

static unsigned long last_valid_pfn;
-pgd_t swapper_pg_dir[PTRS_PER_PGD];
+
+/* These must be page aligned in order to not trigger the
+ * alignment tests of pgd_bad() and pud_bad().
+ */
+pgd_t swapper_pg_dir[PTRS_PER_PGD] __attribute__ ((aligned (PAGE_SIZE)));
+static pud_t swapper_pud_dir[PTRS_PER_PUD] __attribute__ ((aligned (PAGE_SIZE)));

static void sun4u_pgprot_init(void);
static void sun4v_pgprot_init(void);
@@ -1865,6 +1877,8 @@ void __init paging_init(void)
{
unsigned long end_pfn, shift, phys_base;
unsigned long real_end, i;
+ pud_t *pud;
+ pmd_t *pmd;
int node;

setup_page_offset();
@@ -1961,9 +1975,18 @@ void __init paging_init(void)

memset(swapper_low_pmd_dir, 0, sizeof(swapper_low_pmd_dir));

- /* Now can init the kernel/bad page tables. */
- pud_set(pud_offset(&swapper_pg_dir[0], 0),
- swapper_low_pmd_dir + (shift / sizeof(pgd_t)));
+ /* The kernel page tables we publish into what the rest of the
+ * world sees must be adjusted so that they see the PAGE_OFFSET
+ * address of these in-kerenel data structures. However right
+ * here we must access them from the kernel image side, because
+ * the trap tables haven't been taken over and therefore we cannot
+ * take TLB misses in the PAGE_OFFSET linear mappings yet.
+ */
+ pud = swapper_pud_dir + (shift / sizeof(pud_t));
+ pgd_set(&swapper_pg_dir[0], pud);
+
+ pmd = swapper_low_pmd_dir + (shift / sizeof(pmd_t));
+ pud_set(&swapper_pud_dir[0], pmd);

inherit_prom_mappings();


2014-10-28 07:30:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 133/146] sparc64: cpu hardware caps support for sparc M6 and M7

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Allen Pais <[email protected]>

Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/setup_64.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/arch/sparc/kernel/setup_64.c
+++ b/arch/sparc/kernel/setup_64.c
@@ -500,12 +500,16 @@ static void __init init_sparc64_elf_hwca
sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA5 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M6 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M7 ||
sun4v_chip_type == SUN4V_CHIP_SPARC64X)
cap |= HWCAP_SPARC_BLKINIT;
if (sun4v_chip_type == SUN4V_CHIP_NIAGARA2 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA5 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M6 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M7 ||
sun4v_chip_type == SUN4V_CHIP_SPARC64X)
cap |= HWCAP_SPARC_N2;
}
@@ -533,6 +537,8 @@ static void __init init_sparc64_elf_hwca
sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA5 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M6 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M7 ||
sun4v_chip_type == SUN4V_CHIP_SPARC64X)
cap |= (AV_SPARC_VIS | AV_SPARC_VIS2 |
AV_SPARC_ASI_BLK_INIT |
@@ -540,6 +546,8 @@ static void __init init_sparc64_elf_hwca
if (sun4v_chip_type == SUN4V_CHIP_NIAGARA3 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA4 ||
sun4v_chip_type == SUN4V_CHIP_NIAGARA5 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M6 ||
+ sun4v_chip_type == SUN4V_CHIP_SPARC_M7 ||
sun4v_chip_type == SUN4V_CHIP_SPARC64X)
cap |= (AV_SPARC_VIS3 | AV_SPARC_HPC |
AV_SPARC_FMAF);

2014-10-28 03:41:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 129/146] sparc64: Do not define thread fpregs save area as zero-length array.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit e2653143d7d79a49f1a961aeae1d82612838b12c ]

This breaks the stack end corruption detection facility.

What that facility does it write a magic value to "end_of_stack()"
and checking to see if it gets overwritten.

"end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is
the beginning of the FPU register save area.

So once the user uses the FPU, the magic value is overwritten and the
debug checks trigger.

Fix this by making the size explicit.

Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we
are limited to 7 levels of FPU state saves. So each FPU register set
is 256 bytes, allocate 256 * 7 for the fpregs area.

Reported-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/thread_info_64.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -63,7 +63,8 @@ struct thread_info {
struct pt_regs *kern_una_regs;
unsigned int kern_una_insn;

- unsigned long fpregs[0] __attribute__ ((aligned(64)));
+ unsigned long fpregs[(7 * 256) / sizeof(unsigned long)]
+ __attribute__ ((aligned(64)));
};

#endif /* !(__ASSEMBLY__) */

2014-10-28 07:31:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 131/146] sparc64: correctly recognise M6 and M7 cpu type

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Allen Pais <[email protected]>

The following patch adds support for correctly
recognising M6 and M7 cpu type.

Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/spitfire.h | 2 ++
arch/sparc/kernel/cpu.c | 12 ++++++++++++
arch/sparc/kernel/head_64.S | 12 ++++++++++++
3 files changed, 26 insertions(+)

--- a/arch/sparc/include/asm/spitfire.h
+++ b/arch/sparc/include/asm/spitfire.h
@@ -45,6 +45,8 @@
#define SUN4V_CHIP_NIAGARA3 0x03
#define SUN4V_CHIP_NIAGARA4 0x04
#define SUN4V_CHIP_NIAGARA5 0x05
+#define SUN4V_CHIP_SPARC_M6 0x06
+#define SUN4V_CHIP_SPARC_M7 0x07
#define SUN4V_CHIP_SPARC64X 0x8a
#define SUN4V_CHIP_UNKNOWN 0xff

--- a/arch/sparc/kernel/cpu.c
+++ b/arch/sparc/kernel/cpu.c
@@ -494,6 +494,18 @@ static void __init sun4v_cpu_probe(void)
sparc_pmu_type = "niagara5";
break;

+ case SUN4V_CHIP_SPARC_M6:
+ sparc_cpu_type = "SPARC-M6";
+ sparc_fpu_type = "SPARC-M6 integrated FPU";
+ sparc_pmu_type = "sparc-m6";
+ break;
+
+ case SUN4V_CHIP_SPARC_M7:
+ sparc_cpu_type = "SPARC-M7";
+ sparc_fpu_type = "SPARC-M7 integrated FPU";
+ sparc_pmu_type = "sparc-m7";
+ break;
+
case SUN4V_CHIP_SPARC64X:
sparc_cpu_type = "SPARC64-X";
sparc_fpu_type = "SPARC64-X integrated FPU";
--- a/arch/sparc/kernel/head_64.S
+++ b/arch/sparc/kernel/head_64.S
@@ -427,6 +427,12 @@ sun4v_chip_type:
cmp %g2, '5'
be,pt %xcc, 5f
mov SUN4V_CHIP_NIAGARA5, %g4
+ cmp %g2, '6'
+ be,pt %xcc, 5f
+ mov SUN4V_CHIP_SPARC_M6, %g4
+ cmp %g2, '7'
+ be,pt %xcc, 5f
+ mov SUN4V_CHIP_SPARC_M7, %g4
ba,pt %xcc, 49f
nop

@@ -585,6 +591,12 @@ niagara_tlb_fixup:
cmp %g1, SUN4V_CHIP_NIAGARA5
be,pt %xcc, niagara4_patch
nop
+ cmp %g1, SUN4V_CHIP_SPARC_M6
+ be,pt %xcc, niagara4_patch
+ nop
+ cmp %g1, SUN4V_CHIP_SPARC_M7
+ be,pt %xcc, niagara4_patch
+ nop

call generic_patch_copyops
nop

2014-10-28 07:31:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 130/146] sparc64: Fix hibernation code refrence to PAGE_OFFSET.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

We changed PAGE_OFFSET to be a variable rather than a constant,
but this reference here in the hibernate assembler got missed.

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/power/hibernate_asm.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/sparc/power/hibernate_asm.S
+++ b/arch/sparc/power/hibernate_asm.S
@@ -54,8 +54,8 @@ ENTRY(swsusp_arch_resume)
nop

/* Write PAGE_OFFSET to %g7 */
- sethi %uhi(PAGE_OFFSET), %g7
- sllx %g7, 32, %g7
+ sethi %hi(PAGE_OFFSET), %g7
+ ldx [%g7 + %lo(PAGE_OFFSET)], %g7

setuw (PAGE_SIZE-8), %g3


2014-10-28 07:32:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 145/146] sparc64: Fix register corruption in top-most kernel stack frame during boot.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit ef3e035c3a9b81da8a778bc333d10637acf6c199 ]

Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
eventually narrowed this down to only impacting machines using
UltraSPARC-III and derivitive cpus.

The crash happens right when the first user process is spawned:

[ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 54.451346]
[ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab #96
[ 54.666431] Call Trace:
[ 54.698453] [0000000000762f8c] panic+0xb0/0x224
[ 54.759071] [000000000045cf68] do_exit+0x948/0x960
[ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100
[ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10
[ 54.978662] Press Stop-A (L1-A) to return to the boot prom
[ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004

Further investigation showed that compiling only per_cpu_patch() with
an older compiler fixes the boot.

Detailed analysis showed that the function is not being miscompiled by
gcc-4.9, but it is using a different register allocation ordering.

With the gcc-4.9 compiled function, something during the code patching
causes some of the %i* input registers to get corrupted. Perhaps
we have a TLB miss path into the firmware that is deep enough to
cause a register window spill and subsequent restore when we get
back from the TLB miss trap.

Let's plug this up by doing two things:

1) Stop using the firmware stack for client interface calls into
the firmware. Just use the kernel's stack.

2) As soon as we can, call into a new function "start_early_boot()"
to put a one-register-window buffer between the firmware's
deepest stack frame and the top-most initial kernel one.

Reported-by: Meelis Roos <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/oplib_64.h | 3 +-
arch/sparc/include/asm/setup.h | 2 +
arch/sparc/kernel/entry.h | 3 --
arch/sparc/kernel/head_64.S | 40 +++-----------------------------------
arch/sparc/kernel/hvtramp.S | 1
arch/sparc/kernel/setup_64.c | 28 +++++++++++++++++++-------
arch/sparc/kernel/trampoline_64.S | 12 ++++++-----
arch/sparc/prom/cif.S | 5 +---
arch/sparc/prom/init_64.c | 6 ++---
arch/sparc/prom/p1275.c | 2 -
10 files changed, 40 insertions(+), 62 deletions(-)

--- a/arch/sparc/include/asm/oplib_64.h
+++ b/arch/sparc/include/asm/oplib_64.h
@@ -62,7 +62,8 @@ struct linux_mem_p1275 {
/* You must call prom_init() before using any of the library services,
* preferably as early as possible. Pass it the romvec pointer.
*/
-void prom_init(void *cif_handler, void *cif_stack);
+void prom_init(void *cif_handler);
+void prom_init_report(void);

/* Boot argument acquisition, returns the boot command line string. */
char *prom_getbootargs(void);
--- a/arch/sparc/include/asm/setup.h
+++ b/arch/sparc/include/asm/setup.h
@@ -48,6 +48,8 @@ unsigned long safe_compute_effective_add
#endif

#ifdef CONFIG_SPARC64
+void __init start_early_boot(void);
+
/* unaligned_64.c */
int handle_ldf_stq(u32 insn, struct pt_regs *regs);
void handle_ld_nf(u32 insn, struct pt_regs *regs);
--- a/arch/sparc/kernel/entry.h
+++ b/arch/sparc/kernel/entry.h
@@ -65,13 +65,10 @@ struct pause_patch_entry {
extern struct pause_patch_entry __pause_3insn_patch,
__pause_3insn_patch_end;

-void __init per_cpu_patch(void);
void sun4v_patch_1insn_range(struct sun4v_1insn_patch_entry *,
struct sun4v_1insn_patch_entry *);
void sun4v_patch_2insn_range(struct sun4v_2insn_patch_entry *,
struct sun4v_2insn_patch_entry *);
-void __init sun4v_patch(void);
-void __init boot_cpu_id_too_large(int cpu);
extern unsigned int dcache_parity_tl1_occurred;
extern unsigned int icache_parity_tl1_occurred;

--- a/arch/sparc/kernel/head_64.S
+++ b/arch/sparc/kernel/head_64.S
@@ -672,14 +672,12 @@ tlb_fixup_done:
sethi %hi(init_thread_union), %g6
or %g6, %lo(init_thread_union), %g6
ldx [%g6 + TI_TASK], %g4
- mov %sp, %l6

wr %g0, ASI_P, %asi
mov 1, %g1
sllx %g1, THREAD_SHIFT, %g1
sub %g1, (STACKFRAME_SZ + STACK_BIAS), %g1
add %g6, %g1, %sp
- mov 0, %fp

/* Set per-cpu pointer initially to zero, this makes
* the boot-cpu use the in-kernel-image per-cpu areas
@@ -706,44 +704,14 @@ tlb_fixup_done:
nop
#endif

- mov %l6, %o1 ! OpenPROM stack
call prom_init
mov %l7, %o0 ! OpenPROM cif handler

- /* Initialize current_thread_info()->cpu as early as possible.
- * In order to do that accurately we have to patch up the get_cpuid()
- * assembler sequences. And that, in turn, requires that we know
- * if we are on a Starfire box or not. While we're here, patch up
- * the sun4v sequences as well.
+ /* To create a one-register-window buffer between the kernel's
+ * initial stack and the last stack frame we use from the firmware,
+ * do the rest of the boot from a C helper function.
*/
- call check_if_starfire
- nop
- call per_cpu_patch
- nop
- call sun4v_patch
- nop
-
-#ifdef CONFIG_SMP
- call hard_smp_processor_id
- nop
- cmp %o0, NR_CPUS
- blu,pt %xcc, 1f
- nop
- call boot_cpu_id_too_large
- nop
- /* Not reached... */
-
-1:
-#else
- mov 0, %o0
-#endif
- sth %o0, [%g6 + TI_CPU]
-
- call prom_init_report
- nop
-
- /* Off we go.... */
- call start_kernel
+ call start_early_boot
nop
/* Not reached... */

--- a/arch/sparc/kernel/hvtramp.S
+++ b/arch/sparc/kernel/hvtramp.S
@@ -109,7 +109,6 @@ hv_cpu_startup:
sllx %g5, THREAD_SHIFT, %g5
sub %g5, (STACKFRAME_SZ + STACK_BIAS), %g5
add %g6, %g5, %sp
- mov 0, %fp

call init_irqwork_curcpu
nop
--- a/arch/sparc/kernel/setup_64.c
+++ b/arch/sparc/kernel/setup_64.c
@@ -30,6 +30,7 @@
#include <linux/cpu.h>
#include <linux/initrd.h>
#include <linux/module.h>
+#include <linux/start_kernel.h>

#include <asm/io.h>
#include <asm/processor.h>
@@ -174,7 +175,7 @@ char reboot_command[COMMAND_LINE_SIZE];

static struct pt_regs fake_swapper_regs = { { 0, }, 0, 0, 0, 0 };

-void __init per_cpu_patch(void)
+static void __init per_cpu_patch(void)
{
struct cpuid_patch_entry *p;
unsigned long ver;
@@ -266,7 +267,7 @@ void sun4v_patch_2insn_range(struct sun4
}
}

-void __init sun4v_patch(void)
+static void __init sun4v_patch(void)
{
extern void sun4v_hvapi_init(void);

@@ -335,14 +336,25 @@ static void __init pause_patch(void)
}
}

-#ifdef CONFIG_SMP
-void __init boot_cpu_id_too_large(int cpu)
+void __init start_early_boot(void)
{
- prom_printf("Serious problem, boot cpu id (%d) >= NR_CPUS (%d)\n",
- cpu, NR_CPUS);
- prom_halt();
+ int cpu;
+
+ check_if_starfire();
+ per_cpu_patch();
+ sun4v_patch();
+
+ cpu = hard_smp_processor_id();
+ if (cpu >= NR_CPUS) {
+ prom_printf("Serious problem, boot cpu id (%d) >= NR_CPUS (%d)\n",
+ cpu, NR_CPUS);
+ prom_halt();
+ }
+ current_thread_info()->cpu = cpu;
+
+ prom_init_report();
+ start_kernel();
}
-#endif

/* On Ultra, we support all of the v8 capabilities. */
unsigned long sparc64_elf_hwcap = (HWCAP_SPARC_FLUSH | HWCAP_SPARC_STBAR |
--- a/arch/sparc/kernel/trampoline_64.S
+++ b/arch/sparc/kernel/trampoline_64.S
@@ -109,10 +109,13 @@ startup_continue:
brnz,pn %g1, 1b
nop

- sethi %hi(p1275buf), %g2
- or %g2, %lo(p1275buf), %g2
- ldx [%g2 + 0x10], %l2
- add %l2, -(192 + 128), %sp
+ /* Get onto temporary stack which will be in the locked
+ * kernel image.
+ */
+ sethi %hi(tramp_stack), %g1
+ or %g1, %lo(tramp_stack), %g1
+ add %g1, TRAMP_STACK_SIZE, %g1
+ sub %g1, STACKFRAME_SZ + STACK_BIAS + 256, %sp
flushw

/* Setup the loop variables:
@@ -394,7 +397,6 @@ after_lock_tlb:
sllx %g5, THREAD_SHIFT, %g5
sub %g5, (STACKFRAME_SZ + STACK_BIAS), %g5
add %g6, %g5, %sp
- mov 0, %fp

rdpr %pstate, %o1
or %o1, PSTATE_IE, %o1
--- a/arch/sparc/prom/cif.S
+++ b/arch/sparc/prom/cif.S
@@ -11,11 +11,10 @@
.text
.globl prom_cif_direct
prom_cif_direct:
+ save %sp, -192, %sp
sethi %hi(p1275buf), %o1
or %o1, %lo(p1275buf), %o1
- ldx [%o1 + 0x0010], %o2 ! prom_cif_stack
- save %o2, -192, %sp
- ldx [%i1 + 0x0008], %l2 ! prom_cif_handler
+ ldx [%o1 + 0x0008], %l2 ! prom_cif_handler
mov %g4, %l0
mov %g5, %l1
mov %g6, %l3
--- a/arch/sparc/prom/init_64.c
+++ b/arch/sparc/prom/init_64.c
@@ -26,13 +26,13 @@ phandle prom_chosen_node;
* It gets passed the pointer to the PROM vector.
*/

-extern void prom_cif_init(void *, void *);
+extern void prom_cif_init(void *);

-void __init prom_init(void *cif_handler, void *cif_stack)
+void __init prom_init(void *cif_handler)
{
phandle node;

- prom_cif_init(cif_handler, cif_stack);
+ prom_cif_init(cif_handler);

prom_chosen_node = prom_finddevice(prom_chosen_path);
if (!prom_chosen_node || (s32)prom_chosen_node == -1)
--- a/arch/sparc/prom/p1275.c
+++ b/arch/sparc/prom/p1275.c
@@ -20,7 +20,6 @@
struct {
long prom_callback; /* 0x00 */
void (*prom_cif_handler)(long *); /* 0x08 */
- unsigned long prom_cif_stack; /* 0x10 */
} p1275buf;

extern void prom_world(int);
@@ -52,5 +51,4 @@ void p1275_cmd_direct(unsigned long *arg
void prom_cif_init(void *cif_handler, void *cif_stack)
{
p1275buf.prom_cif_handler = (void (*)(long *))cif_handler;
- p1275buf.prom_cif_stack = (unsigned long)cif_stack;
}

2014-10-28 07:32:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 143/146] sparc64: Kill unnecessary tables and increase MAX_BANKS.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit d195b71bad4347d2df51072a537f922546a904f1 ]

swapper_low_pmd_dir and swapper_pud_dir are actually completely
useless and unnecessary.

We just need swapper_pg_dir[]. Naturally the other page table chunks
will be allocated on an as-needed basis. Since the kernel actually
accesses these tables in the PAGE_OFFSET view, there is not even a TLB
locality advantage of placing them in the kernel image.

Use the hard coded vmlinux.ld.S slot for swapper_pg_dir which is
naturally page aligned.

Increase MAX_BANKS to 1024 in order to handle heavily fragmented
virtual guests.

Even with this MAX_BANKS increase, the kernel is 20K+ smaller.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/pgtable_64.h | 1 -
arch/sparc/kernel/vmlinux.lds.S | 5 +++--
arch/sparc/mm/init_64.c | 25 ++-----------------------
3 files changed, 5 insertions(+), 26 deletions(-)

--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -927,7 +927,6 @@ static inline void __set_pte_at(struct m
#endif

extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
-extern pmd_t swapper_low_pmd_dir[PTRS_PER_PMD];

void paging_init(void);
unsigned long find_ecache_flush_span(unsigned long size);
--- a/arch/sparc/kernel/vmlinux.lds.S
+++ b/arch/sparc/kernel/vmlinux.lds.S
@@ -35,8 +35,9 @@ jiffies = jiffies_64;

SECTIONS
{
- /* swapper_low_pmd_dir is sparc64 only */
- swapper_low_pmd_dir = 0x0000000000402000;
+#ifdef CONFIG_SPARC64
+ swapper_pg_dir = 0x0000000000402000;
+#endif
. = INITIAL_ADDRESS;
.text TEXTSTART :
{
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -87,7 +87,7 @@ extern struct tsb swapper_tsb[KERNEL_TSB

static unsigned long cpu_pgsz_mask;

-#define MAX_BANKS 32
+#define MAX_BANKS 1024

static struct linux_prom64_registers pavail[MAX_BANKS];
static int pavail_ents;
@@ -1943,12 +1943,6 @@ static void __init sun4v_linear_pte_xor_

static unsigned long last_valid_pfn;

-/* These must be page aligned in order to not trigger the
- * alignment tests of pgd_bad() and pud_bad().
- */
-pgd_t swapper_pg_dir[PTRS_PER_PGD] __attribute__ ((aligned (PAGE_SIZE)));
-static pud_t swapper_pud_dir[PTRS_PER_PUD] __attribute__ ((aligned (PAGE_SIZE)));
-
static void sun4u_pgprot_init(void);
static void sun4v_pgprot_init(void);

@@ -1956,8 +1950,6 @@ void __init paging_init(void)
{
unsigned long end_pfn, shift, phys_base;
unsigned long real_end, i;
- pud_t *pud;
- pmd_t *pmd;
int node;

setup_page_offset();
@@ -2052,20 +2044,7 @@ void __init paging_init(void)
*/
init_mm.pgd += ((shift) / (sizeof(pgd_t)));

- memset(swapper_low_pmd_dir, 0, sizeof(swapper_low_pmd_dir));
-
- /* The kernel page tables we publish into what the rest of the
- * world sees must be adjusted so that they see the PAGE_OFFSET
- * address of these in-kerenel data structures. However right
- * here we must access them from the kernel image side, because
- * the trap tables haven't been taken over and therefore we cannot
- * take TLB misses in the PAGE_OFFSET linear mappings yet.
- */
- pud = swapper_pud_dir + (shift / sizeof(pud_t));
- pgd_set(&swapper_pg_dir[0], pud);
-
- pmd = swapper_low_pmd_dir + (shift / sizeof(pmd_t));
- pud_set(&swapper_pud_dir[0], pmd);
+ memset(swapper_pg_dir, 0, sizeof(swapper_pg_dir));

inherit_prom_mappings();


2014-10-28 07:33:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 141/146] sparc64: Adjust vmalloc region size based upon available virtual address bits.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit bb4e6e85daa52a9f6210fa06a5ec6269598a202b ]

In order to accomodate embedded per-cpu allocation with large numbers
of cpus and numa nodes, we have to use as much virtual address space
as possible for the vmalloc region. Otherwise we can get things like:

PERCPU: max_distance=0x380001c10000 too large for vmalloc space 0xff00000000

So, once we select a value for PAGE_OFFSET, derive the size of the
vmalloc region based upon that.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/page_64.h | 1 -
arch/sparc/include/asm/pgtable_64.h | 9 +++++----
arch/sparc/kernel/ktlb.S | 8 ++++----
arch/sparc/mm/init_64.c | 30 +++++++++++++++++++-----------
4 files changed, 28 insertions(+), 20 deletions(-)

--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -117,7 +117,6 @@ extern unsigned long sparc64_va_hole_bot

#include <asm-generic/memory_model.h>

-#define PAGE_OFFSET_BY_BITS(X) (-(_AC(1,UL) << (X)))
extern unsigned long PAGE_OFFSET;

#endif /* !(__ASSEMBLY__) */
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -40,10 +40,7 @@
#define LOW_OBP_ADDRESS _AC(0x00000000f0000000,UL)
#define HI_OBP_ADDRESS _AC(0x0000000100000000,UL)
#define VMALLOC_START _AC(0x0000000100000000,UL)
-#define VMALLOC_END _AC(0x0000010000000000,UL)
-#define VMEMMAP_BASE _AC(0x0000010000000000,UL)
-
-#define vmemmap ((struct page *)VMEMMAP_BASE)
+#define VMEMMAP_BASE VMALLOC_END

/* PMD_SHIFT determines the size of the area a second-level page
* table can map
@@ -81,6 +78,10 @@

#ifndef __ASSEMBLY__

+extern unsigned long VMALLOC_END;
+
+#define vmemmap ((struct page *)VMEMMAP_BASE)
+
#include <linux/sched.h>

bool kern_addr_valid(unsigned long addr);
--- a/arch/sparc/kernel/ktlb.S
+++ b/arch/sparc/kernel/ktlb.S
@@ -199,8 +199,8 @@ kvmap_dtlb_nonlinear:

#ifdef CONFIG_SPARSEMEM_VMEMMAP
/* Do not use the TSB for vmemmap. */
- mov (VMEMMAP_BASE >> 40), %g5
- sllx %g5, 40, %g5
+ sethi %hi(VMEMMAP_BASE), %g5
+ ldx [%g5 + %lo(VMEMMAP_BASE)], %g5
cmp %g4,%g5
bgeu,pn %xcc, kvmap_vmemmap
nop
@@ -212,8 +212,8 @@ kvmap_dtlb_tsbmiss:
sethi %hi(MODULES_VADDR), %g5
cmp %g4, %g5
blu,pn %xcc, kvmap_dtlb_longpath
- mov (VMALLOC_END >> 40), %g5
- sllx %g5, 40, %g5
+ sethi %hi(VMALLOC_END), %g5
+ ldx [%g5 + %lo(VMALLOC_END)], %g5
cmp %g4, %g5
bgeu,pn %xcc, kvmap_dtlb_longpath
nop
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1369,25 +1369,24 @@ static unsigned long max_phys_bits = 40;

bool kern_addr_valid(unsigned long addr)
{
- unsigned long above = ((long)addr) >> max_phys_bits;
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;

- if (above != 0 && above != -1UL)
- return false;
-
- if (addr >= (unsigned long) KERNBASE &&
- addr < (unsigned long)&_end)
- return true;
-
- if (addr >= PAGE_OFFSET) {
+ if ((long)addr < 0L) {
unsigned long pa = __pa(addr);

+ if ((addr >> max_phys_bits) != 0UL)
+ return false;
+
return pfn_valid(pa >> PAGE_SHIFT);
}

+ if (addr >= (unsigned long) KERNBASE &&
+ addr < (unsigned long)&_end)
+ return true;
+
pgd = pgd_offset_k(addr);
if (pgd_none(*pgd))
return 0;
@@ -1656,6 +1655,9 @@ unsigned long __init find_ecache_flush_s
unsigned long PAGE_OFFSET;
EXPORT_SYMBOL(PAGE_OFFSET);

+unsigned long VMALLOC_END = 0x0000010000000000UL;
+EXPORT_SYMBOL(VMALLOC_END);
+
unsigned long sparc64_va_hole_top = 0xfffff80000000000UL;
unsigned long sparc64_va_hole_bottom = 0x0000080000000000UL;

@@ -1712,10 +1714,16 @@ static void __init setup_page_offset(voi
prom_halt();
}

- PAGE_OFFSET = PAGE_OFFSET_BY_BITS(max_phys_bits);
+ PAGE_OFFSET = sparc64_va_hole_top;
+ VMALLOC_END = ((sparc64_va_hole_bottom >> 1) +
+ (sparc64_va_hole_bottom >> 2));

- pr_info("PAGE_OFFSET is 0x%016lx (max_phys_bits == %lu)\n",
+ pr_info("MM: PAGE_OFFSET is 0x%016lx (max_phys_bits == %lu)\n",
PAGE_OFFSET, max_phys_bits);
+ pr_info("MM: VMALLOC [0x%016lx --> 0x%016lx]\n",
+ VMALLOC_START, VMALLOC_END);
+ pr_info("MM: VMEMMAP [0x%016lx --> 0x%016lx]\n",
+ VMEMMAP_BASE, VMEMMAP_BASE << 1);
}

static void __init tsb_phys_patch(void)

2014-10-28 03:41:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 137/146] sparc64: Adjust KTSB assembler to support larger physical addresses.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 8c82dc0e883821c098c8b0b130ffebabf9aab5df ]

As currently coded the KTSB accesses in the kernel only support up to
47 bits of physical addressing.

Adjust the instruction and patching sequence in order to support
arbitrary 64 bits addresses.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/tsb.h | 30 ++++++++++++------------------
arch/sparc/mm/init_64.c | 28 +++++++++++++++++++++++++---
2 files changed, 37 insertions(+), 21 deletions(-)

--- a/arch/sparc/include/asm/tsb.h
+++ b/arch/sparc/include/asm/tsb.h
@@ -256,8 +256,6 @@ extern struct tsb_phys_patch_entry __tsb
(KERNEL_TSB_SIZE_BYTES / 16)
#define KERNEL_TSB4M_NENTRIES 4096

-#define KTSB_PHYS_SHIFT 15
-
/* Do a kernel TSB lookup at tl>0 on VADDR+TAG, branch to OK_LABEL
* on TSB hit. REG1, REG2, REG3, and REG4 are used as temporaries
* and the found TTE will be left in REG1. REG3 and REG4 must
@@ -266,17 +264,15 @@ extern struct tsb_phys_patch_entry __tsb
* VADDR and TAG will be preserved and not clobbered by this macro.
*/
#define KERN_TSB_LOOKUP_TL1(VADDR, TAG, REG1, REG2, REG3, REG4, OK_LABEL) \
-661: sethi %hi(swapper_tsb), REG1; \
- or REG1, %lo(swapper_tsb), REG1; \
+661: sethi %uhi(swapper_tsb), REG1; \
+ sethi %hi(swapper_tsb), REG2; \
+ or REG1, %ulo(swapper_tsb), REG1; \
+ or REG2, %lo(swapper_tsb), REG2; \
.section .swapper_tsb_phys_patch, "ax"; \
.word 661b; \
.previous; \
-661: nop; \
- .section .tsb_ldquad_phys_patch, "ax"; \
- .word 661b; \
- sllx REG1, KTSB_PHYS_SHIFT, REG1; \
- sllx REG1, KTSB_PHYS_SHIFT, REG1; \
- .previous; \
+ sllx REG1, 32, REG1; \
+ or REG1, REG2, REG1; \
srlx VADDR, PAGE_SHIFT, REG2; \
and REG2, (KERNEL_TSB_NENTRIES - 1), REG2; \
sllx REG2, 4, REG2; \
@@ -291,17 +287,15 @@ extern struct tsb_phys_patch_entry __tsb
* we can make use of that for the index computation.
*/
#define KERN_TSB4M_LOOKUP_TL1(TAG, REG1, REG2, REG3, REG4, OK_LABEL) \
-661: sethi %hi(swapper_4m_tsb), REG1; \
- or REG1, %lo(swapper_4m_tsb), REG1; \
+661: sethi %uhi(swapper_4m_tsb), REG1; \
+ sethi %hi(swapper_4m_tsb), REG2; \
+ or REG1, %ulo(swapper_4m_tsb), REG1; \
+ or REG2, %lo(swapper_4m_tsb), REG2; \
.section .swapper_4m_tsb_phys_patch, "ax"; \
.word 661b; \
.previous; \
-661: nop; \
- .section .tsb_ldquad_phys_patch, "ax"; \
- .word 661b; \
- sllx REG1, KTSB_PHYS_SHIFT, REG1; \
- sllx REG1, KTSB_PHYS_SHIFT, REG1; \
- .previous; \
+ sllx REG1, 32, REG1; \
+ or REG1, REG2, REG1; \
and TAG, (KERNEL_TSB4M_NENTRIES - 1), REG2; \
sllx REG2, 4, REG2; \
add REG1, REG2, REG2; \
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -1733,19 +1733,41 @@ static void __init tsb_phys_patch(void)
static struct hv_tsb_descr ktsb_descr[NUM_KTSB_DESCR];
extern struct tsb swapper_tsb[KERNEL_TSB_NENTRIES];

+/* The swapper TSBs are loaded with a base sequence of:
+ *
+ * sethi %uhi(SYMBOL), REG1
+ * sethi %hi(SYMBOL), REG2
+ * or REG1, %ulo(SYMBOL), REG1
+ * or REG2, %lo(SYMBOL), REG2
+ * sllx REG1, 32, REG1
+ * or REG1, REG2, REG1
+ *
+ * When we use physical addressing for the TSB accesses, we patch the
+ * first four instructions in the above sequence.
+ */
+
static void patch_one_ktsb_phys(unsigned int *start, unsigned int *end, unsigned long pa)
{
- pa >>= KTSB_PHYS_SHIFT;
+ unsigned long high_bits, low_bits;
+
+ high_bits = (pa >> 32) & 0xffffffff;
+ low_bits = (pa >> 0) & 0xffffffff;

while (start < end) {
unsigned int *ia = (unsigned int *)(unsigned long)*start;

- ia[0] = (ia[0] & ~0x3fffff) | (pa >> 10);
+ ia[0] = (ia[0] & ~0x3fffff) | (high_bits >> 10);
__asm__ __volatile__("flush %0" : : "r" (ia));

- ia[1] = (ia[1] & ~0x3ff) | (pa & 0x3ff);
+ ia[1] = (ia[1] & ~0x3fffff) | (low_bits >> 10);
__asm__ __volatile__("flush %0" : : "r" (ia + 1));

+ ia[2] = (ia[2] & ~0x1fff) | (high_bits & 0x3ff);
+ __asm__ __volatile__("flush %0" : : "r" (ia + 2));
+
+ ia[3] = (ia[3] & ~0x1fff) | (low_bits & 0x3ff);
+ __asm__ __volatile__("flush %0" : : "r" (ia + 3));
+
start++;
}
}

2014-10-28 07:33:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 139/146] sparc64: Use kernel page tables for vmemmap.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit c06240c7f5c39c83dfd7849c0770775562441b96 ]

For sparse memory configurations, the vmemmap array behaves terribly
and it takes up an inordinate amount of space in the BSS section of
the kernel image unconditionally.

Just build huge PMDs and look them up just like we do for TLB misses
in the vmalloc area.

Kernel BSS shrinks by about 2MB.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/ktlb.S | 9 +----
arch/sparc/mm/init_64.c | 72 ++++++++++++++++++++++-------------------------
arch/sparc/mm/init_64.h | 11 -------
3 files changed, 36 insertions(+), 56 deletions(-)

--- a/arch/sparc/kernel/ktlb.S
+++ b/arch/sparc/kernel/ktlb.S
@@ -186,13 +186,8 @@ kvmap_dtlb_load:

#ifdef CONFIG_SPARSEMEM_VMEMMAP
kvmap_vmemmap:
- sub %g4, %g5, %g5
- srlx %g5, ILOG2_4MB, %g5
- sethi %hi(vmemmap_table), %g1
- sllx %g5, 3, %g5
- or %g1, %lo(vmemmap_table), %g1
- ba,pt %xcc, kvmap_dtlb_load
- ldx [%g1 + %g5], %g5
+ KERN_PGTABLE_WALK(%g4, %g5, %g2, kvmap_dtlb_longpath)
+ ba,a,pt %xcc, kvmap_dtlb_load
#endif

kvmap_dtlb_nonlinear:
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2261,18 +2261,9 @@ unsigned long _PAGE_CACHE __read_mostly;
EXPORT_SYMBOL(_PAGE_CACHE);

#ifdef CONFIG_SPARSEMEM_VMEMMAP
-unsigned long vmemmap_table[VMEMMAP_SIZE];
-
-static long __meminitdata addr_start, addr_end;
-static int __meminitdata node_start;
-
int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend,
int node)
{
- unsigned long phys_start = (vstart - VMEMMAP_BASE);
- unsigned long phys_end = (vend - VMEMMAP_BASE);
- unsigned long addr = phys_start & VMEMMAP_CHUNK_MASK;
- unsigned long end = VMEMMAP_ALIGN(phys_end);
unsigned long pte_base;

pte_base = (_PAGE_VALID | _PAGE_SZ4MB_4U |
@@ -2283,47 +2274,52 @@ int __meminit vmemmap_populate(unsigned
_PAGE_CP_4V | _PAGE_CV_4V |
_PAGE_P_4V | _PAGE_W_4V);

- for (; addr < end; addr += VMEMMAP_CHUNK) {
- unsigned long *vmem_pp =
- vmemmap_table + (addr >> VMEMMAP_CHUNK_SHIFT);
- void *block;
+ pte_base |= _PAGE_PMD_HUGE;

- if (!(*vmem_pp & _PAGE_VALID)) {
- block = vmemmap_alloc_block(1UL << ILOG2_4MB, node);
- if (!block)
+ vstart = vstart & PMD_MASK;
+ vend = ALIGN(vend, PMD_SIZE);
+ for (; vstart < vend; vstart += PMD_SIZE) {
+ pgd_t *pgd = pgd_offset_k(vstart);
+ unsigned long pte;
+ pud_t *pud;
+ pmd_t *pmd;
+
+ if (pgd_none(*pgd)) {
+ pud_t *new = vmemmap_alloc_block(PAGE_SIZE, node);
+
+ if (!new)
return -ENOMEM;
+ pgd_populate(&init_mm, pgd, new);
+ }

- *vmem_pp = pte_base | __pa(block);
+ pud = pud_offset(pgd, vstart);
+ if (pud_none(*pud)) {
+ pmd_t *new = vmemmap_alloc_block(PAGE_SIZE, node);

- /* check to see if we have contiguous blocks */
- if (addr_end != addr || node_start != node) {
- if (addr_start)
- printk(KERN_DEBUG " [%lx-%lx] on node %d\n",
- addr_start, addr_end-1, node_start);
- addr_start = addr;
- node_start = node;
- }
- addr_end = addr + VMEMMAP_CHUNK;
+ if (!new)
+ return -ENOMEM;
+ pud_populate(&init_mm, pud, new);
}
- }
- return 0;
-}

-void __meminit vmemmap_populate_print_last(void)
-{
- if (addr_start) {
- printk(KERN_DEBUG " [%lx-%lx] on node %d\n",
- addr_start, addr_end-1, node_start);
- addr_start = 0;
- addr_end = 0;
- node_start = 0;
+ pmd = pmd_offset(pud, vstart);
+
+ pte = pmd_val(*pmd);
+ if (!(pte & _PAGE_VALID)) {
+ void *block = vmemmap_alloc_block(PMD_SIZE, node);
+
+ if (!block)
+ return -ENOMEM;
+
+ pmd_val(*pmd) = pte_base | __pa(block);
+ }
}
+
+ return 0;
}

void vmemmap_free(unsigned long start, unsigned long end)
{
}
-
#endif /* CONFIG_SPARSEMEM_VMEMMAP */

static void prot_init_common(unsigned long page_none,
--- a/arch/sparc/mm/init_64.h
+++ b/arch/sparc/mm/init_64.h
@@ -31,15 +31,4 @@ extern unsigned long kern_locked_tte_dat

void prom_world(int enter);

-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-#define VMEMMAP_CHUNK_SHIFT 22
-#define VMEMMAP_CHUNK (1UL << VMEMMAP_CHUNK_SHIFT)
-#define VMEMMAP_CHUNK_MASK ~(VMEMMAP_CHUNK - 1UL)
-#define VMEMMAP_ALIGN(x) (((x)+VMEMMAP_CHUNK-1UL)&VMEMMAP_CHUNK_MASK)
-
-#define VMEMMAP_SIZE ((((1UL << MAX_PHYSADDR_BITS) >> PAGE_SHIFT) * \
- sizeof(struct page)) >> VMEMMAP_CHUNK_SHIFT)
-extern unsigned long vmemmap_table[VMEMMAP_SIZE];
-#endif
-
#endif /* _SPARC64_MM_INIT_H */

2014-10-28 07:34:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 138/146] sparc64: Fix physical memory management regressions with large max_phys_bits.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 0dd5b7b09e13dae32869371e08e1048349fd040c ]

If max_phys_bits needs to be > 43 (f.e. for T4 chips), things like
DEBUG_PAGEALLOC stop working because the 3-level page tables only
can cover up to 43 bits.

Another problem is that when we increased MAX_PHYS_ADDRESS_BITS up to
47, several statically allocated tables became enormous.

Compounding this is that we will need to support up to 49 bits of
physical addressing for M7 chips.

The two tables in question are sparc64_valid_addr_bitmap and
kpte_linear_bitmap.

The first holds a bitmap, with 1 bit for each 4MB chunk of physical
memory, indicating whether that chunk actually exists in the machine
and is valid.

The second table is a set of 2-bit values which tell how large of a
mapping (4MB, 256MB, 2GB, 16GB, respectively) we can use at each 256MB
chunk of ram in the system.

These tables are huge and take up an enormous amount of the BSS
section of the sparc64 kernel image. Specifically, the
sparc64_valid_addr_bitmap is 4MB, and the kpte_linear_bitmap is 128K.

So let's solve the space wastage and the DEBUG_PAGEALLOC problem
at the same time, by using the kernel page tables (as designed) to
manage this information.

We have to keep using large mappings when DEBUG_PAGEALLOC is disabled,
and we do this by encoding huge PMDs and PUDs.

On a T4-2 with 256GB of ram the kernel page table takes up 16K with
DEBUG_PAGEALLOC disabled and 256MB with it enabled. Furthermore, this
memory is dynamically allocated at run time rather than coded
statically into the kernel image.

Signed-off-by: David S. Miller <[email protected]>
Acked-by: Bob Picco <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/page_64.h | 3
arch/sparc/include/asm/pgtable_64.h | 55 ++---
arch/sparc/include/asm/tsb.h | 47 +++-
arch/sparc/kernel/ktlb.S | 108 ---------
arch/sparc/kernel/vmlinux.lds.S | 5
arch/sparc/mm/init_64.c | 393 +++++++++++++++---------------------
arch/sparc/mm/init_64.h | 7
7 files changed, 244 insertions(+), 374 deletions(-)

--- a/arch/sparc/include/asm/page_64.h
+++ b/arch/sparc/include/asm/page_64.h
@@ -128,9 +128,6 @@ extern unsigned long PAGE_OFFSET;
*/
#define MAX_PHYS_ADDRESS_BITS 47

-/* These two shift counts are used when indexing sparc64_valid_addr_bitmap
- * and kpte_linear_bitmap.
- */
#define ILOG2_4MB 22
#define ILOG2_256MB 28

--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -79,22 +79,7 @@

#include <linux/sched.h>

-extern unsigned long sparc64_valid_addr_bitmap[];
-
-/* Needs to be defined here and not in linux/mm.h, as it is arch dependent */
-static inline bool __kern_addr_valid(unsigned long paddr)
-{
- if ((paddr >> MAX_PHYS_ADDRESS_BITS) != 0UL)
- return false;
- return test_bit(paddr >> ILOG2_4MB, sparc64_valid_addr_bitmap);
-}
-
-static inline bool kern_addr_valid(unsigned long addr)
-{
- unsigned long paddr = __pa(addr);
-
- return __kern_addr_valid(paddr);
-}
+bool kern_addr_valid(unsigned long addr);

/* Entries per page directory level. */
#define PTRS_PER_PTE (1UL << (PAGE_SHIFT-3))
@@ -122,6 +107,7 @@ static inline bool kern_addr_valid(unsig
#define _PAGE_R _AC(0x8000000000000000,UL) /* Keep ref bit uptodate*/
#define _PAGE_SPECIAL _AC(0x0200000000000000,UL) /* Special page */
#define _PAGE_PMD_HUGE _AC(0x0100000000000000,UL) /* Huge page */
+#define _PAGE_PUD_HUGE _PAGE_PMD_HUGE

/* Advertise support for _PAGE_SPECIAL */
#define __HAVE_ARCH_PTE_SPECIAL
@@ -668,26 +654,26 @@ static inline unsigned long pmd_large(pm
return pte_val(pte) & _PAGE_PMD_HUGE;
}

-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static inline unsigned long pmd_young(pmd_t pmd)
+static inline unsigned long pmd_pfn(pmd_t pmd)
{
pte_t pte = __pte(pmd_val(pmd));

- return pte_young(pte);
+ return pte_pfn(pte);
}

-static inline unsigned long pmd_write(pmd_t pmd)
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static inline unsigned long pmd_young(pmd_t pmd)
{
pte_t pte = __pte(pmd_val(pmd));

- return pte_write(pte);
+ return pte_young(pte);
}

-static inline unsigned long pmd_pfn(pmd_t pmd)
+static inline unsigned long pmd_write(pmd_t pmd)
{
pte_t pte = __pte(pmd_val(pmd));

- return pte_pfn(pte);
+ return pte_write(pte);
}

static inline unsigned long pmd_trans_huge(pmd_t pmd)
@@ -781,18 +767,15 @@ static inline int pmd_present(pmd_t pmd)
* the top bits outside of the range of any physical address size we
* support are clear as well. We also validate the physical itself.
*/
-#define pmd_bad(pmd) ((pmd_val(pmd) & ~PAGE_MASK) || \
- !__kern_addr_valid(pmd_val(pmd)))
+#define pmd_bad(pmd) (pmd_val(pmd) & ~PAGE_MASK)

#define pud_none(pud) (!pud_val(pud))

-#define pud_bad(pud) ((pud_val(pud) & ~PAGE_MASK) || \
- !__kern_addr_valid(pud_val(pud)))
+#define pud_bad(pud) (pud_val(pud) & ~PAGE_MASK)

#define pgd_none(pgd) (!pgd_val(pgd))

-#define pgd_bad(pgd) ((pgd_val(pgd) & ~PAGE_MASK) || \
- !__kern_addr_valid(pgd_val(pgd)))
+#define pgd_bad(pgd) (pgd_val(pgd) & ~PAGE_MASK)

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
void set_pmd_at(struct mm_struct *mm, unsigned long addr,
@@ -835,6 +818,20 @@ static inline unsigned long __pmd_page(p
#define pgd_present(pgd) (pgd_val(pgd) != 0U)
#define pgd_clear(pgdp) (pgd_val(*(pgd)) = 0UL)

+static inline unsigned long pud_large(pud_t pud)
+{
+ pte_t pte = __pte(pud_val(pud));
+
+ return pte_val(pte) & _PAGE_PMD_HUGE;
+}
+
+static inline unsigned long pud_pfn(pud_t pud)
+{
+ pte_t pte = __pte(pud_val(pud));
+
+ return pte_pfn(pte);
+}
+
/* Same in both SUN4V and SUN4U. */
#define pte_none(pte) (!pte_val(pte))

--- a/arch/sparc/include/asm/tsb.h
+++ b/arch/sparc/include/asm/tsb.h
@@ -133,9 +133,24 @@ extern struct tsb_phys_patch_entry __tsb
sub TSB, 0x8, TSB; \
TSB_STORE(TSB, TAG);

- /* Do a kernel page table walk. Leaves physical PTE pointer in
- * REG1. Jumps to FAIL_LABEL on early page table walk termination.
- * VADDR will not be clobbered, but REG2 will.
+ /* Do a kernel page table walk. Leaves valid PTE value in
+ * REG1. Jumps to FAIL_LABEL on early page table walk
+ * termination. VADDR will not be clobbered, but REG2 will.
+ *
+ * There are two masks we must apply to propagate bits from
+ * the virtual address into the PTE physical address field
+ * when dealing with huge pages. This is because the page
+ * table boundaries do not match the huge page size(s) the
+ * hardware supports.
+ *
+ * In these cases we propagate the bits that are below the
+ * page table level where we saw the huge page mapping, but
+ * are still within the relevant physical bits for the huge
+ * page size in question. So for PMD mappings (which fall on
+ * bit 23, for 8MB per PMD) we must propagate bit 22 for a
+ * 4MB huge page. For huge PUDs (which fall on bit 33, for
+ * 8GB per PUD), we have to accomodate 256MB and 2GB huge
+ * pages. So for those we propagate bits 32 to 28.
*/
#define KERN_PGTABLE_WALK(VADDR, REG1, REG2, FAIL_LABEL) \
sethi %hi(swapper_pg_dir), REG1; \
@@ -150,15 +165,35 @@ extern struct tsb_phys_patch_entry __tsb
andn REG2, 0x7, REG2; \
ldxa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \
brz,pn REG1, FAIL_LABEL; \
- sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \
+ sethi %uhi(_PAGE_PUD_HUGE), REG2; \
+ brz,pn REG1, FAIL_LABEL; \
+ sllx REG2, 32, REG2; \
+ andcc REG1, REG2, %g0; \
+ sethi %hi(0xf8000000), REG2; \
+ bne,pt %xcc, 697f; \
+ sllx REG2, 1, REG2; \
+ sllx VADDR, 64 - (PMD_SHIFT + PMD_BITS), REG2; \
srlx REG2, 64 - PAGE_SHIFT, REG2; \
andn REG2, 0x7, REG2; \
ldxa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \
+ sethi %uhi(_PAGE_PMD_HUGE), REG2; \
brz,pn REG1, FAIL_LABEL; \
- sllx VADDR, 64 - PMD_SHIFT, REG2; \
+ sllx REG2, 32, REG2; \
+ andcc REG1, REG2, %g0; \
+ be,pn %xcc, 698f; \
+ sethi %hi(0x400000), REG2; \
+697: brgez,pn REG1, FAIL_LABEL; \
+ andn REG1, REG2, REG1; \
+ and VADDR, REG2, REG2; \
+ ba,pt %xcc, 699f; \
+ or REG1, REG2, REG1; \
+698: sllx VADDR, 64 - PMD_SHIFT, REG2; \
srlx REG2, 64 - PAGE_SHIFT, REG2; \
andn REG2, 0x7, REG2; \
- add REG1, REG2, REG1;
+ ldxa [REG1 + REG2] ASI_PHYS_USE_EC, REG1; \
+ brgez,pn REG1, FAIL_LABEL; \
+ nop; \
+699:

/* PMD has been loaded into REG1, interpret the value, seeing
* if it is a HUGE PMD or a normal one. If it is not valid
--- a/arch/sparc/kernel/ktlb.S
+++ b/arch/sparc/kernel/ktlb.S
@@ -47,14 +47,6 @@ kvmap_itlb_vmalloc_addr:
KERN_PGTABLE_WALK(%g4, %g5, %g2, kvmap_itlb_longpath)

TSB_LOCK_TAG(%g1, %g2, %g7)
-
- /* Load and check PTE. */
- ldxa [%g5] ASI_PHYS_USE_EC, %g5
- mov 1, %g7
- sllx %g7, TSB_TAG_INVALID_BIT, %g7
- brgez,a,pn %g5, kvmap_itlb_longpath
- TSB_STORE(%g1, %g7)
-
TSB_WRITE(%g1, %g5, %g6)

/* fallthrough to TLB load */
@@ -118,6 +110,12 @@ kvmap_dtlb_obp:
ba,pt %xcc, kvmap_dtlb_load
nop

+kvmap_linear_early:
+ sethi %hi(kern_linear_pte_xor), %g7
+ ldx [%g7 + %lo(kern_linear_pte_xor)], %g2
+ ba,pt %xcc, kvmap_dtlb_tsb4m_load
+ xor %g2, %g4, %g5
+
.align 32
kvmap_dtlb_tsb4m_load:
TSB_LOCK_TAG(%g1, %g2, %g7)
@@ -146,105 +144,17 @@ kvmap_dtlb_4v:
/* Correct TAG_TARGET is already in %g6, check 4mb TSB. */
KERN_TSB4M_LOOKUP_TL1(%g6, %g5, %g1, %g2, %g3, kvmap_dtlb_load)
#endif
- /* TSB entry address left in %g1, lookup linear PTE.
- * Must preserve %g1 and %g6 (TAG).
- */
-kvmap_dtlb_tsb4m_miss:
- /* Clear the PAGE_OFFSET top virtual bits, shift
- * down to get PFN, and make sure PFN is in range.
- */
-661: sllx %g4, 0, %g5
- .section .page_offset_shift_patch, "ax"
- .word 661b
- .previous
-
- /* Check to see if we know about valid memory at the 4MB
- * chunk this physical address will reside within.
+ /* Linear mapping TSB lookup failed. Fallthrough to kernel
+ * page table based lookup.
*/
-661: srlx %g5, MAX_PHYS_ADDRESS_BITS, %g2
- .section .page_offset_shift_patch, "ax"
- .word 661b
- .previous
-
- brnz,pn %g2, kvmap_dtlb_longpath
- nop
-
- /* This unconditional branch and delay-slot nop gets patched
- * by the sethi sequence once the bitmap is properly setup.
- */
- .globl valid_addr_bitmap_insn
-valid_addr_bitmap_insn:
- ba,pt %xcc, 2f
- nop
- .subsection 2
- .globl valid_addr_bitmap_patch
-valid_addr_bitmap_patch:
- sethi %hi(sparc64_valid_addr_bitmap), %g7
- or %g7, %lo(sparc64_valid_addr_bitmap), %g7
- .previous
-
-661: srlx %g5, ILOG2_4MB, %g2
- .section .page_offset_shift_patch, "ax"
- .word 661b
- .previous
-
- srlx %g2, 6, %g5
- and %g2, 63, %g2
- sllx %g5, 3, %g5
- ldx [%g7 + %g5], %g5
- mov 1, %g7
- sllx %g7, %g2, %g7
- andcc %g5, %g7, %g0
- be,pn %xcc, kvmap_dtlb_longpath
-
-2: sethi %hi(kpte_linear_bitmap), %g2
-
- /* Get the 256MB physical address index. */
-661: sllx %g4, 0, %g5
- .section .page_offset_shift_patch, "ax"
- .word 661b
- .previous
-
- or %g2, %lo(kpte_linear_bitmap), %g2
-
-661: srlx %g5, ILOG2_256MB, %g5
- .section .page_offset_shift_patch, "ax"
- .word 661b
- .previous
-
- and %g5, (32 - 1), %g7
-
- /* Divide by 32 to get the offset into the bitmask. */
- srlx %g5, 5, %g5
- add %g7, %g7, %g7
- sllx %g5, 3, %g5
-
- /* kern_linear_pte_xor[(mask >> shift) & 3)] */
- ldx [%g2 + %g5], %g2
- srlx %g2, %g7, %g7
- sethi %hi(kern_linear_pte_xor), %g5
- and %g7, 3, %g7
- or %g5, %lo(kern_linear_pte_xor), %g5
- sllx %g7, 3, %g7
- ldx [%g5 + %g7], %g2
-
.globl kvmap_linear_patch
kvmap_linear_patch:
- ba,pt %xcc, kvmap_dtlb_tsb4m_load
- xor %g2, %g4, %g5
+ ba,a,pt %xcc, kvmap_linear_early

kvmap_dtlb_vmalloc_addr:
KERN_PGTABLE_WALK(%g4, %g5, %g2, kvmap_dtlb_longpath)

TSB_LOCK_TAG(%g1, %g2, %g7)
-
- /* Load and check PTE. */
- ldxa [%g5] ASI_PHYS_USE_EC, %g5
- mov 1, %g7
- sllx %g7, TSB_TAG_INVALID_BIT, %g7
- brgez,a,pn %g5, kvmap_dtlb_longpath
- TSB_STORE(%g1, %g7)
-
TSB_WRITE(%g1, %g5, %g6)

/* fallthrough to TLB load */
--- a/arch/sparc/kernel/vmlinux.lds.S
+++ b/arch/sparc/kernel/vmlinux.lds.S
@@ -122,11 +122,6 @@ SECTIONS
*(.swapper_4m_tsb_phys_patch)
__swapper_4m_tsb_phys_patch_end = .;
}
- .page_offset_shift_patch : {
- __page_offset_shift_patch = .;
- *(.page_offset_shift_patch)
- __page_offset_shift_patch_end = .;
- }
.popc_3insn_patch : {
__popc_3insn_patch = .;
*(.popc_3insn_patch)
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -75,7 +75,6 @@ unsigned long kern_linear_pte_xor[4] __r
* 'cpu' properties, but we need to have this table setup before the
* MDESC is initialized.
*/
-unsigned long kpte_linear_bitmap[KPTE_BITMAP_BYTES / sizeof(unsigned long)];

#ifndef CONFIG_DEBUG_PAGEALLOC
/* A special kernel TSB for 4MB, 256MB, 2GB and 16GB linear mappings.
@@ -84,6 +83,7 @@ unsigned long kpte_linear_bitmap[KPTE_BI
*/
extern struct tsb swapper_4m_tsb[KERNEL_TSB4M_NENTRIES];
#endif
+extern struct tsb swapper_tsb[KERNEL_TSB_NENTRIES];

static unsigned long cpu_pgsz_mask;

@@ -165,10 +165,6 @@ static void __init read_obp_memory(const
cmp_p64, NULL);
}

-unsigned long sparc64_valid_addr_bitmap[VALID_ADDR_BITMAP_BYTES /
- sizeof(unsigned long)];
-EXPORT_SYMBOL(sparc64_valid_addr_bitmap);
-
/* Kernel physical address base and size in bytes. */
unsigned long kern_base __read_mostly;
unsigned long kern_size __read_mostly;
@@ -1369,9 +1365,145 @@ static unsigned long __init bootmem_init
static struct linux_prom64_registers pall[MAX_BANKS] __initdata;
static int pall_ents __initdata;

-#ifdef CONFIG_DEBUG_PAGEALLOC
+static unsigned long max_phys_bits = 40;
+
+bool kern_addr_valid(unsigned long addr)
+{
+ unsigned long above = ((long)addr) >> max_phys_bits;
+ pgd_t *pgd;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ if (above != 0 && above != -1UL)
+ return false;
+
+ if (addr >= (unsigned long) KERNBASE &&
+ addr < (unsigned long)&_end)
+ return true;
+
+ if (addr >= PAGE_OFFSET) {
+ unsigned long pa = __pa(addr);
+
+ return pfn_valid(pa >> PAGE_SHIFT);
+ }
+
+ pgd = pgd_offset_k(addr);
+ if (pgd_none(*pgd))
+ return 0;
+
+ pud = pud_offset(pgd, addr);
+ if (pud_none(*pud))
+ return 0;
+
+ if (pud_large(*pud))
+ return pfn_valid(pud_pfn(*pud));
+
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ return 0;
+
+ if (pmd_large(*pmd))
+ return pfn_valid(pmd_pfn(*pmd));
+
+ pte = pte_offset_kernel(pmd, addr);
+ if (pte_none(*pte))
+ return 0;
+
+ return pfn_valid(pte_pfn(*pte));
+}
+EXPORT_SYMBOL(kern_addr_valid);
+
+static unsigned long __ref kernel_map_hugepud(unsigned long vstart,
+ unsigned long vend,
+ pud_t *pud)
+{
+ const unsigned long mask16gb = (1UL << 34) - 1UL;
+ u64 pte_val = vstart;
+
+ /* Each PUD is 8GB */
+ if ((vstart & mask16gb) ||
+ (vend - vstart <= mask16gb)) {
+ pte_val ^= kern_linear_pte_xor[2];
+ pud_val(*pud) = pte_val | _PAGE_PUD_HUGE;
+
+ return vstart + PUD_SIZE;
+ }
+
+ pte_val ^= kern_linear_pte_xor[3];
+ pte_val |= _PAGE_PUD_HUGE;
+
+ vend = vstart + mask16gb + 1UL;
+ while (vstart < vend) {
+ pud_val(*pud) = pte_val;
+
+ pte_val += PUD_SIZE;
+ vstart += PUD_SIZE;
+ pud++;
+ }
+ return vstart;
+}
+
+static bool kernel_can_map_hugepud(unsigned long vstart, unsigned long vend,
+ bool guard)
+{
+ if (guard && !(vstart & ~PUD_MASK) && (vend - vstart) >= PUD_SIZE)
+ return true;
+
+ return false;
+}
+
+static unsigned long __ref kernel_map_hugepmd(unsigned long vstart,
+ unsigned long vend,
+ pmd_t *pmd)
+{
+ const unsigned long mask256mb = (1UL << 28) - 1UL;
+ const unsigned long mask2gb = (1UL << 31) - 1UL;
+ u64 pte_val = vstart;
+
+ /* Each PMD is 8MB */
+ if ((vstart & mask256mb) ||
+ (vend - vstart <= mask256mb)) {
+ pte_val ^= kern_linear_pte_xor[0];
+ pmd_val(*pmd) = pte_val | _PAGE_PMD_HUGE;
+
+ return vstart + PMD_SIZE;
+ }
+
+ if ((vstart & mask2gb) ||
+ (vend - vstart <= mask2gb)) {
+ pte_val ^= kern_linear_pte_xor[1];
+ pte_val |= _PAGE_PMD_HUGE;
+ vend = vstart + mask256mb + 1UL;
+ } else {
+ pte_val ^= kern_linear_pte_xor[2];
+ pte_val |= _PAGE_PMD_HUGE;
+ vend = vstart + mask2gb + 1UL;
+ }
+
+ while (vstart < vend) {
+ pmd_val(*pmd) = pte_val;
+
+ pte_val += PMD_SIZE;
+ vstart += PMD_SIZE;
+ pmd++;
+ }
+
+ return vstart;
+}
+
+static bool kernel_can_map_hugepmd(unsigned long vstart, unsigned long vend,
+ bool guard)
+{
+ if (guard && !(vstart & ~PMD_MASK) && (vend - vstart) >= PMD_SIZE)
+ return true;
+
+ return false;
+}
+
static unsigned long __ref kernel_map_range(unsigned long pstart,
- unsigned long pend, pgprot_t prot)
+ unsigned long pend, pgprot_t prot,
+ bool use_huge)
{
unsigned long vstart = PAGE_OFFSET + pstart;
unsigned long vend = PAGE_OFFSET + pend;
@@ -1401,15 +1533,23 @@ static unsigned long __ref kernel_map_ra
if (pud_none(*pud)) {
pmd_t *new;

+ if (kernel_can_map_hugepud(vstart, vend, use_huge)) {
+ vstart = kernel_map_hugepud(vstart, vend, pud);
+ continue;
+ }
new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
alloc_bytes += PAGE_SIZE;
pud_populate(&init_mm, pud, new);
}

pmd = pmd_offset(pud, vstart);
- if (!pmd_present(*pmd)) {
+ if (pmd_none(*pmd)) {
pte_t *new;

+ if (kernel_can_map_hugepmd(vstart, vend, use_huge)) {
+ vstart = kernel_map_hugepmd(vstart, vend, pmd);
+ continue;
+ }
new = __alloc_bootmem(PAGE_SIZE, PAGE_SIZE, PAGE_SIZE);
alloc_bytes += PAGE_SIZE;
pmd_populate_kernel(&init_mm, pmd, new);
@@ -1432,100 +1572,34 @@ static unsigned long __ref kernel_map_ra
return alloc_bytes;
}

-extern unsigned int kvmap_linear_patch[1];
-#endif /* CONFIG_DEBUG_PAGEALLOC */
-
-static void __init kpte_set_val(unsigned long index, unsigned long val)
-{
- unsigned long *ptr = kpte_linear_bitmap;
-
- val <<= ((index % (BITS_PER_LONG / 2)) * 2);
- ptr += (index / (BITS_PER_LONG / 2));
-
- *ptr |= val;
-}
-
-static const unsigned long kpte_shift_min = 28; /* 256MB */
-static const unsigned long kpte_shift_max = 34; /* 16GB */
-static const unsigned long kpte_shift_incr = 3;
-
-static unsigned long kpte_mark_using_shift(unsigned long start, unsigned long end,
- unsigned long shift)
+static void __init flush_all_kernel_tsbs(void)
{
- unsigned long size = (1UL << shift);
- unsigned long mask = (size - 1UL);
- unsigned long remains = end - start;
- unsigned long val;
-
- if (remains < size || (start & mask))
- return start;
-
- /* VAL maps:
- *
- * shift 28 --> kern_linear_pte_xor index 1
- * shift 31 --> kern_linear_pte_xor index 2
- * shift 34 --> kern_linear_pte_xor index 3
- */
- val = ((shift - kpte_shift_min) / kpte_shift_incr) + 1;
-
- remains &= ~mask;
- if (shift != kpte_shift_max)
- remains = size;
-
- while (remains) {
- unsigned long index = start >> kpte_shift_min;
+ int i;

- kpte_set_val(index, val);
+ for (i = 0; i < KERNEL_TSB_NENTRIES; i++) {
+ struct tsb *ent = &swapper_tsb[i];

- start += 1UL << kpte_shift_min;
- remains -= 1UL << kpte_shift_min;
+ ent->tag = (1UL << TSB_TAG_INVALID_BIT);
}
+#ifndef CONFIG_DEBUG_PAGEALLOC
+ for (i = 0; i < KERNEL_TSB4M_NENTRIES; i++) {
+ struct tsb *ent = &swapper_4m_tsb[i];

- return start;
-}
-
-static void __init mark_kpte_bitmap(unsigned long start, unsigned long end)
-{
- unsigned long smallest_size, smallest_mask;
- unsigned long s;
-
- smallest_size = (1UL << kpte_shift_min);
- smallest_mask = (smallest_size - 1UL);
-
- while (start < end) {
- unsigned long orig_start = start;
-
- for (s = kpte_shift_max; s >= kpte_shift_min; s -= kpte_shift_incr) {
- start = kpte_mark_using_shift(start, end, s);
-
- if (start != orig_start)
- break;
- }
-
- if (start == orig_start)
- start = (start + smallest_size) & ~smallest_mask;
+ ent->tag = (1UL << TSB_TAG_INVALID_BIT);
}
+#endif
}

-static void __init init_kpte_bitmap(void)
-{
- unsigned long i;
-
- for (i = 0; i < pall_ents; i++) {
- unsigned long phys_start, phys_end;
-
- phys_start = pall[i].phys_addr;
- phys_end = phys_start + pall[i].reg_size;
-
- mark_kpte_bitmap(phys_start, phys_end);
- }
-}
+extern unsigned int kvmap_linear_patch[1];

static void __init kernel_physical_mapping_init(void)
{
-#ifdef CONFIG_DEBUG_PAGEALLOC
unsigned long i, mem_alloced = 0UL;
+ bool use_huge = true;

+#ifdef CONFIG_DEBUG_PAGEALLOC
+ use_huge = false;
+#endif
for (i = 0; i < pall_ents; i++) {
unsigned long phys_start, phys_end;

@@ -1533,7 +1607,7 @@ static void __init kernel_physical_mappi
phys_end = phys_start + pall[i].reg_size;

mem_alloced += kernel_map_range(phys_start, phys_end,
- PAGE_KERNEL);
+ PAGE_KERNEL, use_huge);
}

printk("Allocated %ld bytes for kernel page tables.\n",
@@ -1542,8 +1616,9 @@ static void __init kernel_physical_mappi
kvmap_linear_patch[0] = 0x01000000; /* nop */
flushi(&kvmap_linear_patch[0]);

+ flush_all_kernel_tsbs();
+
__flush_tlb_all();
-#endif
}

#ifdef CONFIG_DEBUG_PAGEALLOC
@@ -1553,7 +1628,7 @@ void kernel_map_pages(struct page *page,
unsigned long phys_end = phys_start + (numpages * PAGE_SIZE);

kernel_map_range(phys_start, phys_end,
- (enable ? PAGE_KERNEL : __pgprot(0)));
+ (enable ? PAGE_KERNEL : __pgprot(0)), false);

flush_tsb_kernel_range(PAGE_OFFSET + phys_start,
PAGE_OFFSET + phys_end);
@@ -1581,62 +1656,11 @@ unsigned long __init find_ecache_flush_s
unsigned long PAGE_OFFSET;
EXPORT_SYMBOL(PAGE_OFFSET);

-static void __init page_offset_shift_patch_one(unsigned int *insn, unsigned long phys_bits)
-{
- unsigned long final_shift;
- unsigned int val = *insn;
- unsigned int cnt;
-
- /* We are patching in ilog2(max_supported_phys_address), and
- * we are doing so in a manner similar to a relocation addend.
- * That is, we are adding the shift value to whatever value
- * is in the shift instruction count field already.
- */
- cnt = (val & 0x3f);
- val &= ~0x3f;
-
- /* If we are trying to shift >= 64 bits, clear the destination
- * register. This can happen when phys_bits ends up being equal
- * to MAX_PHYS_ADDRESS_BITS.
- */
- final_shift = (cnt + (64 - phys_bits));
- if (final_shift >= 64) {
- unsigned int rd = (val >> 25) & 0x1f;
-
- val = 0x80100000 | (rd << 25);
- } else {
- val |= final_shift;
- }
- *insn = val;
-
- __asm__ __volatile__("flush %0"
- : /* no outputs */
- : "r" (insn));
-}
-
-static void __init page_offset_shift_patch(unsigned long phys_bits)
-{
- extern unsigned int __page_offset_shift_patch;
- extern unsigned int __page_offset_shift_patch_end;
- unsigned int *p;
-
- p = &__page_offset_shift_patch;
- while (p < &__page_offset_shift_patch_end) {
- unsigned int *insn = (unsigned int *)(unsigned long)*p;
-
- page_offset_shift_patch_one(insn, phys_bits);
-
- p++;
- }
-}
-
unsigned long sparc64_va_hole_top = 0xfffff80000000000UL;
unsigned long sparc64_va_hole_bottom = 0x0000080000000000UL;

static void __init setup_page_offset(void)
{
- unsigned long max_phys_bits = 40;
-
if (tlb_type == cheetah || tlb_type == cheetah_plus) {
/* Cheetah/Panther support a full 64-bit virtual
* address, so we can use all that our page tables
@@ -1685,8 +1709,6 @@ static void __init setup_page_offset(voi

pr_info("PAGE_OFFSET is 0x%016lx (max_phys_bits == %lu)\n",
PAGE_OFFSET, max_phys_bits);
-
- page_offset_shift_patch(max_phys_bits);
}

static void __init tsb_phys_patch(void)
@@ -1731,7 +1753,6 @@ static void __init tsb_phys_patch(void)
#define NUM_KTSB_DESCR 1
#endif
static struct hv_tsb_descr ktsb_descr[NUM_KTSB_DESCR];
-extern struct tsb swapper_tsb[KERNEL_TSB_NENTRIES];

/* The swapper TSBs are loaded with a base sequence of:
*
@@ -2030,11 +2051,9 @@ void __init paging_init(void)

pmd = swapper_low_pmd_dir + (shift / sizeof(pmd_t));
pud_set(&swapper_pud_dir[0], pmd);
-
+
inherit_prom_mappings();

- init_kpte_bitmap();
-
/* Ok, we can use our TLB miss and window trap handlers safely. */
setup_tba();

@@ -2141,70 +2160,6 @@ int page_in_phys_avail(unsigned long pad
return 0;
}

-static struct linux_prom64_registers pavail_rescan[MAX_BANKS] __initdata;
-static int pavail_rescan_ents __initdata;
-
-/* Certain OBP calls, such as fetching "available" properties, can
- * claim physical memory. So, along with initializing the valid
- * address bitmap, what we do here is refetch the physical available
- * memory list again, and make sure it provides at least as much
- * memory as 'pavail' does.
- */
-static void __init setup_valid_addr_bitmap_from_pavail(unsigned long *bitmap)
-{
- int i;
-
- read_obp_memory("available", &pavail_rescan[0], &pavail_rescan_ents);
-
- for (i = 0; i < pavail_ents; i++) {
- unsigned long old_start, old_end;
-
- old_start = pavail[i].phys_addr;
- old_end = old_start + pavail[i].reg_size;
- while (old_start < old_end) {
- int n;
-
- for (n = 0; n < pavail_rescan_ents; n++) {
- unsigned long new_start, new_end;
-
- new_start = pavail_rescan[n].phys_addr;
- new_end = new_start +
- pavail_rescan[n].reg_size;
-
- if (new_start <= old_start &&
- new_end >= (old_start + PAGE_SIZE)) {
- set_bit(old_start >> ILOG2_4MB, bitmap);
- goto do_next_page;
- }
- }
-
- prom_printf("mem_init: Lost memory in pavail\n");
- prom_printf("mem_init: OLD start[%lx] size[%lx]\n",
- pavail[i].phys_addr,
- pavail[i].reg_size);
- prom_printf("mem_init: NEW start[%lx] size[%lx]\n",
- pavail_rescan[i].phys_addr,
- pavail_rescan[i].reg_size);
- prom_printf("mem_init: Cannot continue, aborting.\n");
- prom_halt();
-
- do_next_page:
- old_start += PAGE_SIZE;
- }
- }
-}
-
-static void __init patch_tlb_miss_handler_bitmap(void)
-{
- extern unsigned int valid_addr_bitmap_insn[];
- extern unsigned int valid_addr_bitmap_patch[];
-
- valid_addr_bitmap_insn[1] = valid_addr_bitmap_patch[1];
- mb();
- valid_addr_bitmap_insn[0] = valid_addr_bitmap_patch[0];
- flushi(&valid_addr_bitmap_insn[0]);
-}
-
static void __init register_page_bootmem_info(void)
{
#ifdef CONFIG_NEED_MULTIPLE_NODES
@@ -2217,18 +2172,6 @@ static void __init register_page_bootmem
}
void __init mem_init(void)
{
- unsigned long addr, last;
-
- addr = PAGE_OFFSET + kern_base;
- last = PAGE_ALIGN(kern_size) + addr;
- while (addr < last) {
- set_bit(__pa(addr) >> ILOG2_4MB, sparc64_valid_addr_bitmap);
- addr += PAGE_SIZE;
- }
-
- setup_valid_addr_bitmap_from_pavail(sparc64_valid_addr_bitmap);
- patch_tlb_miss_handler_bitmap();
-
high_memory = __va(last_valid_pfn << PAGE_SHIFT);

register_page_bootmem_info();
--- a/arch/sparc/mm/init_64.h
+++ b/arch/sparc/mm/init_64.h
@@ -8,15 +8,8 @@
*/

#define MAX_PHYS_ADDRESS (1UL << MAX_PHYS_ADDRESS_BITS)
-#define KPTE_BITMAP_CHUNK_SZ (256UL * 1024UL * 1024UL)
-#define KPTE_BITMAP_BYTES \
- ((MAX_PHYS_ADDRESS / KPTE_BITMAP_CHUNK_SZ) / 4)
-#define VALID_ADDR_BITMAP_CHUNK_SZ (4UL * 1024UL * 1024UL)
-#define VALID_ADDR_BITMAP_BYTES \
- ((MAX_PHYS_ADDRESS / VALID_ADDR_BITMAP_CHUNK_SZ) / 8)

extern unsigned long kern_linear_pte_xor[4];
-extern unsigned long kpte_linear_bitmap[KPTE_BITMAP_BYTES / sizeof(unsigned long)];
extern unsigned int sparc64_highest_unlocked_tlb_ent;
extern unsigned long sparc64_kern_pri_context;
extern unsigned long sparc64_kern_pri_nuc_bits;

2014-10-28 07:34:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 127/146] sparc64: Fix lockdep warnings on reboot on Ultra-5

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit bdcf81b658ebc4c2640c3c2c55c8b31c601b6996 ]

Inconsistently, the raw_* IRQ routines do not interact with and update
the irqflags tracing and lockdep state, whereas the raw_* spinlock
interfaces do.

This causes problems in p1275_cmd_direct() because we disable hardirqs
by hand using raw_local_irq_restore() and then do a raw_spin_lock()
which triggers a lockdep trace because the CPU's hw IRQ state doesn't
match IRQ tracing's internal software copy of that state.

The CPU's irqs are disabled, yet current->hardirqs_enabled is true.

====================
reboot: Restarting system
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240()
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
Modules linked in: openpromfs
CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G W 3.17.0-dirty #145
Call Trace:
[000000000045919c] warn_slowpath_common+0x5c/0xa0
[0000000000459210] warn_slowpath_fmt+0x30/0x40
[000000000048f41c] check_flags+0x7c/0x240
[0000000000493280] lock_acquire+0x20/0x1c0
[0000000000832b70] _raw_spin_lock+0x30/0x60
[000000000068f2fc] p1275_cmd_direct+0x1c/0x60
[000000000068ed28] prom_reboot+0x28/0x40
[000000000043610c] machine_restart+0x4c/0x80
[000000000047d2d4] kernel_restart+0x54/0x80
[000000000047d618] SyS_reboot+0x138/0x200
[00000000004060b4] linux_sparc_syscall32+0x34/0x60
---[ end trace 5c439fe81c05a100 ]---
possible reason: unannotated irqs-off.
irq event stamp: 2010267
hardirqs last enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580
hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580
softirqs last enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0
softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40
Resetting ...
====================

Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees
all of our changes.

Reported-by: Meelis Roos <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/prom/p1275.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/arch/sparc/prom/p1275.c
+++ b/arch/sparc/prom/p1275.c
@@ -9,6 +9,7 @@
#include <linux/smp.h>
#include <linux/string.h>
#include <linux/spinlock.h>
+#include <linux/irqflags.h>

#include <asm/openprom.h>
#include <asm/oplib.h>
@@ -36,8 +37,8 @@ void p1275_cmd_direct(unsigned long *arg
{
unsigned long flags;

- raw_local_save_flags(flags);
- raw_local_irq_restore((unsigned long)PIL_NMI);
+ local_save_flags(flags);
+ local_irq_restore((unsigned long)PIL_NMI);
raw_spin_lock(&prom_entry_lock);

prom_world(1);
@@ -45,7 +46,7 @@ void p1275_cmd_direct(unsigned long *arg
prom_world(0);

raw_spin_unlock(&prom_entry_lock);
- raw_local_irq_restore(flags);
+ local_irq_restore(flags);
}

void prom_cif_init(void *cif_handler, void *cif_stack)

2014-10-28 07:35:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 112/146] futex: Ensure get_futex_key_refs() always implies a barrier

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Catalin Marinas <[email protected]>

commit 76835b0ebf8a7fe85beb03c75121419a7dec52f0 upstream.

Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
nothing to wake up) changes the futex code to avoid taking a lock when
there are no waiters. This code has been subsequently fixed in commit
11d4616bd07f (futex: revert back to the explicit waiter counting code).
Both the original commit and the fix-up rely on get_futex_key_refs() to
always imply a barrier.

However, for private futexes, none of the cases in the switch statement
of get_futex_key_refs() would be hit and the function completes without
a memory barrier as required before checking the "waiters" in
futex_wake() -> hb_waiters_pending(). The consequence is a race with a
thread waiting on a futex on another CPU, allowing the waker thread to
read "waiters == 0" while the waiter thread to have read "futex_val ==
locked" (in kernel).

Without this fix, the problem (user space deadlocks) can be seen with
Android bionic's mutex implementation on an arm64 multi-cluster system.

Signed-off-by: Catalin Marinas <[email protected]>
Reported-by: Matteo Franchin <[email protected]>
Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
Acked-by: Davidlohr Bueso <[email protected]>
Tested-by: Mike Galbraith <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/futex.c | 2 ++
1 file changed, 2 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -343,6 +343,8 @@ static void get_futex_key_refs(union fut
case FUT_OFF_MMSHARED:
futex_get_mm(key); /* implies MB (B) */
break;
+ default:
+ smp_mb(); /* explicit MB (B) */
}
}


2014-10-28 03:40:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 106/146] mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Junxiao Bi <[email protected]>

commit 934f3072c17cc8886f4c043b47eeeb1b12f8de33 upstream.

commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O
during memory allocation") introduces PF_MEMALLOC_NOIO flag to avoid doing
I/O inside memory allocation, __GFP_IO is cleared when this flag is set,
but __GFP_FS implies __GFP_IO, it should also be cleared. Or it may still
run into I/O, like in superblock shrinker. And this will make the kernel
run into the deadlock case described in that commit.

See Dave Chinner's comment about io in superblock shrinker:

Filesystem shrinkers do indeed perform IO from the superblock shrinker and
have for years. Even clean inodes can require IO before they can be freed
- e.g. on an orphan list, need truncation of post-eof blocks, need to
wait for ordered operations to complete before it can be freed, etc.

IOWs, Ext4, btrfs and XFS all can issue and/or block on arbitrary amounts
of IO in the superblock shrinker context. XFS, in particular, has been
doing transactions and IO from the VFS inode cache shrinker since it was
first introduced....

Fix this by clearing __GFP_FS in memalloc_noio_flags(), this function has
masked all the gfp_mask that will be passed into fs for the processes
setting PF_MEMALLOC_NOIO in the direct reclaim path.

v1 thread at: https://lkml.org/lkml/2014/9/3/32

Signed-off-by: Junxiao Bi <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: joyce.xue <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Trond Myklebust <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/sched.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1934,11 +1934,13 @@ extern void thread_group_cputime_adjuste
#define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
#define used_math() tsk_used_math(current)

-/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags */
+/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags
+ * __GFP_FS is also cleared as it implies __GFP_IO.
+ */
static inline gfp_t memalloc_noio_flags(gfp_t flags)
{
if (unlikely(current->flags & PF_MEMALLOC_NOIO))
- flags &= ~__GFP_IO;
+ flags &= ~(__GFP_IO | __GFP_FS);
return flags;
}


2014-10-28 07:36:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 125/146] sparc: Let memset return the address argument

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andreas Larsson <[email protected]>

[ Upstream commit 74cad25c076a2f5253312c2fe82d1a4daecc1323 ]

This makes memset follow the standard (instead of returning 0 on success). This
is needed when certain versions of gcc optimizes around memset calls and assume
that the address argument is preserved in %o0.

Signed-off-by: Andreas Larsson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/lib/memset.S | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

--- a/arch/sparc/lib/memset.S
+++ b/arch/sparc/lib/memset.S
@@ -3,8 +3,9 @@
* Copyright (C) 1996,1997 Jakub Jelinek ([email protected])
* Copyright (C) 1996 David S. Miller ([email protected])
*
- * Returns 0, if ok, and number of bytes not yet set if exception
- * occurs and we were called as clear_user.
+ * Calls to memset returns initial %o0. Calls to bzero returns 0, if ok, and
+ * number of bytes not yet set if exception occurs and we were called as
+ * clear_user.
*/

#include <asm/ptrace.h>
@@ -65,6 +66,8 @@ __bzero_begin:
.globl __memset_start, __memset_end
__memset_start:
memset:
+ mov %o0, %g1
+ mov 1, %g4
and %o1, 0xff, %g3
sll %g3, 8, %g2
or %g3, %g2, %g3
@@ -89,6 +92,7 @@ memset:
sub %o0, %o2, %o0

__bzero:
+ clr %g4
mov %g0, %g3
1:
cmp %o1, 7
@@ -151,8 +155,8 @@ __bzero:
bne,a 8f
EX(stb %g3, [%o0], and %o1, 1)
8:
- retl
- clr %o0
+ b 0f
+ nop
7:
be 13b
orcc %o1, 0, %g0
@@ -164,6 +168,12 @@ __bzero:
bne 8b
EX(stb %g3, [%o0 - 1], add %o1, 1)
0:
+ andcc %g4, 1, %g0
+ be 5f
+ nop
+ retl
+ mov %g1, %o0
+5:
retl
clr %o0
__memset_end:

2014-10-28 03:40:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 122/146] sparc64: Fix corrupted thread fault code.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <[email protected]>

[ Upstream commit 84bd6d8b9c0f06b3f188efb479c77e20f05e9a8a ]

Every path that ends up at do_sparc64_fault() must install a valid
FAULT_CODE_* bitmask in the per-thread fault code byte.

Two paths leading to the label winfix_trampoline (which expects the
FAULT_CODE_* mask in register %g4) were not doing so:

1) For pre-hypervisor TLB protection violation traps, if we took
the 'winfix_trampoline' path we wouldn't have %g4 initialized
with the FAULT_CODE_* value yet. Resulting in using the
TLB_TAG_ACCESS register address value instead.

2) In the TSB miss path, when we notice that we are going to use a
hugepage mapping, but we haven't allocated the hugepage TSB yet, we
still have to take the window fixup case into consideration and
in that particular path we leave %g4 not setup properly.

Errors on this sort were largely invisible previously, but after
commit 4ccb9272892c33ef1c19a783cfa87103b30c2784 ("sparc64: sun4v TLB
error power off events") we now have a fault_code mask bit
(FAULT_CODE_BAD_RA) that triggers due to this bug.

FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS
(see #1 above) and thus we get seemingly random bus errors triggered
for user processes.

Fixes: 4ccb9272892c ("sparc64: sun4v TLB error power off events")
Reported-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/dtlb_prot.S | 6 +++---
arch/sparc/kernel/tsb.S | 6 +++---
2 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/sparc/kernel/dtlb_prot.S
+++ b/arch/sparc/kernel/dtlb_prot.S
@@ -24,11 +24,11 @@
mov TLB_TAG_ACCESS, %g4 ! For reload of vaddr

/* PROT ** ICACHE line 2: More real fault processing */
+ ldxa [%g4] ASI_DMMU, %g5 ! Put tagaccess in %g5
bgu,pn %xcc, winfix_trampoline ! Yes, perform winfixup
- ldxa [%g4] ASI_DMMU, %g5 ! Put tagaccess in %g5
- ba,pt %xcc, sparc64_realfault_common ! Nope, normal fault
mov FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
- nop
+ ba,pt %xcc, sparc64_realfault_common ! Nope, normal fault
+ nop
nop
nop
nop
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -162,10 +162,10 @@ tsb_miss_page_table_walk_sun4v_fastpath:
nop
.previous

- rdpr %tl, %g3
- cmp %g3, 1
+ rdpr %tl, %g7
+ cmp %g7, 1
bne,pn %xcc, winfix_trampoline
- nop
+ mov %g3, %g4
ba,pt %xcc, etrap
rd %pc, %g7
call hugetlb_setup

2014-10-28 03:40:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 105/146] Bluetooth: 6lowpan: Route packets that are not meant to peer via correct device

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jukka Rissanen <[email protected]>

commit 39e90c77637b3892a39f2908aea57539e961c50e upstream.

Packets that are supposed to be delivered via the peer device need to
be checked and sent to correct device. This requires that user has set
the routes properly so that the 6lowpan module can then figure out
the destination gateway and the correct Bluetooth device.

Signed-off-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/bluetooth/6lowpan.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 63 insertions(+), 2 deletions(-)

--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -39,6 +39,7 @@ static struct dentry *lowpan_control_deb

struct skb_cb {
struct in6_addr addr;
+ struct in6_addr gw;
struct l2cap_chan *chan;
int status;
};
@@ -158,6 +159,54 @@ static inline struct lowpan_peer *peer_l
return NULL;
}

+static inline struct lowpan_peer *peer_lookup_dst(struct lowpan_dev *dev,
+ struct in6_addr *daddr,
+ struct sk_buff *skb)
+{
+ struct lowpan_peer *peer, *tmp;
+ struct in6_addr *nexthop;
+ struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
+ int count = atomic_read(&dev->peer_count);
+
+ BT_DBG("peers %d addr %pI6c rt %p", count, daddr, rt);
+
+ /* If we have multiple 6lowpan peers, then check where we should
+ * send the packet. If only one peer exists, then we can send the
+ * packet right away.
+ */
+ if (count == 1)
+ return list_first_entry(&dev->peers, struct lowpan_peer,
+ list);
+
+ if (!rt) {
+ nexthop = &lowpan_cb(skb)->gw;
+
+ if (ipv6_addr_any(nexthop))
+ return NULL;
+ } else {
+ nexthop = rt6_nexthop(rt);
+
+ /* We need to remember the address because it is needed
+ * by bt_xmit() when sending the packet. In bt_xmit(), the
+ * destination routing info is not set.
+ */
+ memcpy(&lowpan_cb(skb)->gw, nexthop, sizeof(struct in6_addr));
+ }
+
+ BT_DBG("gw %pI6c", nexthop);
+
+ list_for_each_entry_safe(peer, tmp, &dev->peers, list) {
+ BT_DBG("dst addr %pMR dst type %d ip %pI6c",
+ &peer->chan->dst, peer->chan->dst_type,
+ &peer->peer_addr);
+
+ if (!ipv6_addr_cmp(&peer->peer_addr, nexthop))
+ return peer;
+ }
+
+ return NULL;
+}
+
static struct lowpan_peer *lookup_peer(struct l2cap_conn *conn)
{
struct lowpan_dev *entry, *tmp;
@@ -415,8 +464,18 @@ static int header_create(struct sk_buff
read_unlock_irqrestore(&devices_lock, flags);

if (!peer) {
- BT_DBG("no such peer %pMR found", &addr);
- return -ENOENT;
+ /* The packet might be sent to 6lowpan interface
+ * because of routing (either via default route
+ * or user set route) so get peer according to
+ * the destination address.
+ */
+ read_lock_irqsave(&devices_lock, flags);
+ peer = peer_lookup_dst(dev, &hdr->daddr, skb);
+ read_unlock_irqrestore(&devices_lock, flags);
+ if (!peer) {
+ BT_DBG("no such peer %pMR found", &addr);
+ return -ENOENT;
+ }
}

daddr = peer->eui64_addr;
@@ -520,6 +579,8 @@ static netdev_tx_t bt_xmit(struct sk_buf

read_lock_irqsave(&devices_lock, flags);
peer = peer_lookup_ba(dev, &addr, addr_type);
+ if (!peer)
+ peer = peer_lookup_dst(dev, &lowpan_cb(skb)->addr, skb);
read_unlock_irqrestore(&devices_lock, flags);

BT_DBG("xmit %s to %pMR type %d IP %pI6c peer %p",

2014-10-28 07:37:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 123/146] sparc64: find_node adjustment

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: bob picco <[email protected]>

[ Upstream commit 3dee9df54836d5f844f3d58281d3f3e6331b467f ]

We have seen an issue with guest boot into LDOM that causes early boot failures
because of no matching rules for node identitity of the memory. I analyzed this
on my T4 and concluded there might not be a solution. I saw the issue in
mainline too when booting into the control/primary domain - with guests
configured. Note, this could be a firmware bug on some older machines.

I'll provide a full explanation of the issues below. Should we not find a
matching BEST latency group for a real address (RA) then we will assume node 0.
On the T4-2 here with the information provided I can't see an alternative.

Technically the LDOM shown below should match the MBLOCK to the
favorable latency group. However other factors must be considered too. Were
the memory controllers configured "fine" grained interleave or "coarse"
grain interleaved - T4. Also should a "group" MD node be considered a NUMA
node?

There has to be at least one Machine Description (MD) "group" and hence one
NUMA node. The group can have one or more latency groups (lg) - more than one
memory controller. The current code chooses the smallest latency as the most
favorable per group. The latency and lg information is in MLGROUP below.
MBLOCK is the base and size of the RAs for the machine as fetched from OBP
/memory "available" property. My machine has one MBLOCK but more would be
possible - with holes?

For a T4-2 the following information has been gathered:
with LDOM guest
MEMBLOCK configuration:
memory size = 0x27f870000
memory.cnt = 0x3
memory[0x0] [0x00000020400000-0x0000029fc67fff], 0x27f868000 bytes
memory[0x1] [0x0000029fd8a000-0x0000029fd8bfff], 0x2000 bytes
memory[0x2] [0x0000029fd92000-0x0000029fd97fff], 0x6000 bytes
reserved.cnt = 0x2
reserved[0x0] [0x00000020800000-0x000000216c15c0], 0xec15c1 bytes
reserved[0x1] [0x00000024800000-0x0000002c180c1e], 0x7980c1f bytes
MBLOCK[0]: base[20000000] size[280000000] offset[0]
(note: "base" and "size" reported in "MBLOCK" encompass the "memory[X]" values)
(note: (RA + offset) & mask = val is the formula to detect a match for the
memory controller. should there be no match for find_node node, a return
value of -1 resulted for the node - BAD)

There is one group. It has these forward links
MLGROUP[1]: node[545] latency[1f7e8] match[200000000] mask[200000000]
MLGROUP[2]: node[54d] latency[2de60] match[0] mask[200000000]
NUMA NODE[0]: node[545] mask[200000000] val[200000000] (latency[1f7e8])
(note: "val" is the best lg's (smallest latency) "match")

no LDOM guest - bare metal
MEMBLOCK configuration:
memory size = 0xfdf2d0000
memory.cnt = 0x3
memory[0x0] [0x00000020400000-0x00000fff6adfff], 0xfdf2ae000 bytes
memory[0x1] [0x00000fff6d2000-0x00000fff6e7fff], 0x16000 bytes
memory[0x2] [0x00000fff766000-0x00000fff771fff], 0xc000 bytes
reserved.cnt = 0x2
reserved[0x0] [0x00000020800000-0x00000021a04580], 0x1204581 bytes
reserved[0x1] [0x00000024800000-0x0000002c7d29fc], 0x7fd29fd bytes
MBLOCK[0]: base[20000000] size[fe0000000] offset[0]

there are two groups
group node[16d5]
MLGROUP[0]: node[1765] latency[1f7e8] match[0] mask[200000000]
MLGROUP[3]: node[177d] latency[2de60] match[200000000] mask[200000000]
NUMA NODE[0]: node[1765] mask[200000000] val[0] (latency[1f7e8])
group node[171d]
MLGROUP[2]: node[1775] latency[2de60] match[0] mask[200000000]
MLGROUP[1]: node[176d] latency[1f7e8] match[200000000] mask[200000000]
NUMA NODE[1]: node[176d] mask[200000000] val[200000000] (latency[1f7e8])
(note: for this two "group" bare metal machine, 1/2 memory is in group one's
lg and 1/2 memory is in group two's lg).

Cc: [email protected]
Signed-off-by: Bob Picco <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/mm/init_64.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -840,7 +840,10 @@ static int find_node(unsigned long addr)
if ((addr & p->mask) == p->val)
return i;
}
- return -1;
+ /* The following condition has been observed on LDOM guests.*/
+ WARN_ONCE(1, "find_node: A physical address doesn't match a NUMA node"
+ " rule. Some physical memory will be owned by node 0.");
+ return 0;
}

static u64 memblock_nid_range(u64 start, u64 end, int *nid)

2014-10-28 07:38:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 121/146] sparc64: sun4v TLB error power off events

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: bob picco <[email protected]>

[ Upstream commit 4ccb9272892c33ef1c19a783cfa87103b30c2784 ]

We've witnessed a few TLB events causing the machine to power off because
of prom_halt. In one case it was some nfs related area during rmmod. Another
was an mmapper of /dev/mem. A more recent one is an ITLB issue with
a bad pagesize which could be a hardware bug. Bugs happen but we should
attempt to not power off the machine and/or hang it when possible.

This is a DTLB error from an mmapper of /dev/mem:
[root@sparcie ~]# SUN4V-DTLB: Error at TPC[fffff80100903e6c], tl 1
SUN4V-DTLB: TPC<0xfffff80100903e6c>
SUN4V-DTLB: O7[fffff801081979d0]
SUN4V-DTLB: O7<0xfffff801081979d0>
SUN4V-DTLB: vaddr[fffff80100000000] ctx[1250] pte[98000000000f0610] error[2]
.

This is recent mainline for ITLB:
[ 3708.179864] SUN4V-ITLB: TPC<0xfffffc010071cefc>
[ 3708.188866] SUN4V-ITLB: O7[fffffc010071cee8]
[ 3708.197377] SUN4V-ITLB: O7<0xfffffc010071cee8>
[ 3708.206539] SUN4V-ITLB: vaddr[e0003] ctx[1a3c] pte[2900000dcc800eeb] error[4]
.

Normally sun4v_itlb_error_report() and sun4v_dtlb_error_report() would call
prom_halt() and drop us to OF command prompt "ok". This isn't the case for
LDOMs and the machine powers off.

For the HV reported error of HV_ENORADDR for HV HV_MMU_MAP_ADDR_TRAP we cause
a SIGBUS error by qualifying it within do_sparc64_fault() for fault code mask
of FAULT_CODE_BAD_RA. This is done when trap level (%tl) is less or equal
one("1"). Otherwise, for %tl > 1, we proceed eventually to die_if_kernel().

The logic of this patch was partially inspired by David Miller's feedback.

Power off of large sparc64 machines is painful. Plus die_if_kernel provides
more context. A reset sequence isn't a brief period on large sparc64 but
better than power-off/power-on sequence.

Cc: [email protected]
Signed-off-by: Bob Picco <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/thread_info_64.h | 1
arch/sparc/kernel/sun4v_tlb_miss.S | 35 +++++++++++++++++++-------------
arch/sparc/kernel/traps_64.c | 15 ++++++++-----
arch/sparc/mm/fault_64.c | 3 ++
4 files changed, 34 insertions(+), 20 deletions(-)

--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -102,6 +102,7 @@ struct thread_info {
#define FAULT_CODE_ITLB 0x04 /* Miss happened in I-TLB */
#define FAULT_CODE_WINFIXUP 0x08 /* Miss happened during spill/fill */
#define FAULT_CODE_BLKCOMMIT 0x10 /* Use blk-commit ASI in copy_page */
+#define FAULT_CODE_BAD_RA 0x20 /* Bad RA for sun4v */

#if PAGE_SHIFT == 13
#define THREAD_SIZE (2*PAGE_SIZE)
--- a/arch/sparc/kernel/sun4v_tlb_miss.S
+++ b/arch/sparc/kernel/sun4v_tlb_miss.S
@@ -195,6 +195,11 @@ sun4v_tsb_miss_common:
ldx [%g2 + TRAP_PER_CPU_PGD_PADDR], %g7

sun4v_itlb_error:
+ rdpr %tl, %g1
+ cmp %g1, 1
+ ble,pt %icc, sun4v_bad_ra
+ or %g0, FAULT_CODE_BAD_RA | FAULT_CODE_ITLB, %g1
+
sethi %hi(sun4v_err_itlb_vaddr), %g1
stx %g4, [%g1 + %lo(sun4v_err_itlb_vaddr)]
sethi %hi(sun4v_err_itlb_ctx), %g1
@@ -206,15 +211,10 @@ sun4v_itlb_error:
sethi %hi(sun4v_err_itlb_error), %g1
stx %o0, [%g1 + %lo(sun4v_err_itlb_error)]

+ sethi %hi(1f), %g7
rdpr %tl, %g4
- cmp %g4, 1
- ble,pt %icc, 1f
- sethi %hi(2f), %g7
ba,pt %xcc, etraptl1
- or %g7, %lo(2f), %g7
-
-1: ba,pt %xcc, etrap
-2: or %g7, %lo(2b), %g7
+1: or %g7, %lo(1f), %g7
mov %l4, %o1
call sun4v_itlb_error_report
add %sp, PTREGS_OFF, %o0
@@ -222,6 +222,11 @@ sun4v_itlb_error:
/* NOTREACHED */

sun4v_dtlb_error:
+ rdpr %tl, %g1
+ cmp %g1, 1
+ ble,pt %icc, sun4v_bad_ra
+ or %g0, FAULT_CODE_BAD_RA | FAULT_CODE_DTLB, %g1
+
sethi %hi(sun4v_err_dtlb_vaddr), %g1
stx %g4, [%g1 + %lo(sun4v_err_dtlb_vaddr)]
sethi %hi(sun4v_err_dtlb_ctx), %g1
@@ -233,21 +238,23 @@ sun4v_dtlb_error:
sethi %hi(sun4v_err_dtlb_error), %g1
stx %o0, [%g1 + %lo(sun4v_err_dtlb_error)]

+ sethi %hi(1f), %g7
rdpr %tl, %g4
- cmp %g4, 1
- ble,pt %icc, 1f
- sethi %hi(2f), %g7
ba,pt %xcc, etraptl1
- or %g7, %lo(2f), %g7
-
-1: ba,pt %xcc, etrap
-2: or %g7, %lo(2b), %g7
+1: or %g7, %lo(1f), %g7
mov %l4, %o1
call sun4v_dtlb_error_report
add %sp, PTREGS_OFF, %o0

/* NOTREACHED */

+sun4v_bad_ra:
+ or %g0, %g4, %g5
+ ba,pt %xcc, sparc64_realfault_common
+ or %g1, %g0, %g4
+
+ /* NOTREACHED */
+
/* Instruction Access Exception, tl0. */
sun4v_iacc:
ldxa [%g0] ASI_SCRATCHPAD, %g2
--- a/arch/sparc/kernel/traps_64.c
+++ b/arch/sparc/kernel/traps_64.c
@@ -2104,6 +2104,11 @@ void sun4v_nonresum_overflow(struct pt_r
atomic_inc(&sun4v_nonresum_oflow_cnt);
}

+static void sun4v_tlb_error(struct pt_regs *regs)
+{
+ die_if_kernel("TLB/TSB error", regs);
+}
+
unsigned long sun4v_err_itlb_vaddr;
unsigned long sun4v_err_itlb_ctx;
unsigned long sun4v_err_itlb_pte;
@@ -2111,8 +2116,7 @@ unsigned long sun4v_err_itlb_error;

void sun4v_itlb_error_report(struct pt_regs *regs, int tl)
{
- if (tl > 1)
- dump_tl1_traplog((struct tl1_traplog *)(regs + 1));
+ dump_tl1_traplog((struct tl1_traplog *)(regs + 1));

printk(KERN_EMERG "SUN4V-ITLB: Error at TPC[%lx], tl %d\n",
regs->tpc, tl);
@@ -2125,7 +2129,7 @@ void sun4v_itlb_error_report(struct pt_r
sun4v_err_itlb_vaddr, sun4v_err_itlb_ctx,
sun4v_err_itlb_pte, sun4v_err_itlb_error);

- prom_halt();
+ sun4v_tlb_error(regs);
}

unsigned long sun4v_err_dtlb_vaddr;
@@ -2135,8 +2139,7 @@ unsigned long sun4v_err_dtlb_error;

void sun4v_dtlb_error_report(struct pt_regs *regs, int tl)
{
- if (tl > 1)
- dump_tl1_traplog((struct tl1_traplog *)(regs + 1));
+ dump_tl1_traplog((struct tl1_traplog *)(regs + 1));

printk(KERN_EMERG "SUN4V-DTLB: Error at TPC[%lx], tl %d\n",
regs->tpc, tl);
@@ -2149,7 +2152,7 @@ void sun4v_dtlb_error_report(struct pt_r
sun4v_err_dtlb_vaddr, sun4v_err_dtlb_ctx,
sun4v_err_dtlb_pte, sun4v_err_dtlb_error);

- prom_halt();
+ sun4v_tlb_error(regs);
}

void hypervisor_tlbop_error(unsigned long err, unsigned long op)
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -346,6 +346,9 @@ retry:
down_read(&mm->mmap_sem);
}

+ if (fault_code & FAULT_CODE_BAD_RA)
+ goto do_sigbus;
+
vma = find_vma(mm, address);
if (!vma)
goto bad_area;

2014-10-28 03:40:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 117/146] ima: fix fallback to use new_sync_read()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Kasatkin <[email protected]>

commit 27cd1fc3ae5374a4a86662c67033f15ef27b2461 upstream.

3.16 commit aad4f8bb42af06371aa0e85bf0cd9d52c0494985
'switch simple generic_file_aio_read() users to ->read_iter()'
replaced ->aio_read with ->read_iter in most of the file systems
and introduced new_sync_read() as a replacement for do_sync_read().

Most of file systems set '->read' and ima_kernel_read is not affected.
When ->read is not set, this patch adopts fallback call changes from the
vfs_read.

Signed-off-by: Dmitry Kasatkin <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/integrity/ima/ima_crypto.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/security/integrity/ima/ima_crypto.c
+++ b/security/integrity/ima/ima_crypto.c
@@ -80,19 +80,19 @@ static int ima_kernel_read(struct file *
{
mm_segment_t old_fs;
char __user *buf = addr;
- ssize_t ret;
+ ssize_t ret = -EINVAL;

if (!(file->f_mode & FMODE_READ))
return -EBADF;
- if (!file->f_op->read && !file->f_op->aio_read)
- return -EINVAL;

old_fs = get_fs();
set_fs(get_ds());
if (file->f_op->read)
ret = file->f_op->read(file, buf, count, &offset);
- else
+ else if (file->f_op->aio_read)
ret = do_sync_read(file, buf, count, &offset);
+ else if (file->f_op->read_iter)
+ ret = new_sync_read(file, buf, count, &offset);
set_fs(old_fs);
return ret;
}

2014-10-28 07:38:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 118/146] ima: provide flag to identify new empty files

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Kasatkin <[email protected]>

commit b151d6b00bbb798c58f2f21305e7d43fa763f34f upstream.

On ima_file_free(), newly created empty files are not labeled with
an initial security.ima value, because the iversion did not change.
Commit dff6efc "fs: fix iversion handling" introduced a change in
iversion behavior. To verify this change use the shell command:

$ (exec >foo)
$ getfattr -h -e hex -d -m security foo

This patch defines the IMA_NEW_FILE flag. The flag is initially
set, when IMA detects that a new file is created, and subsequently
checked on the ima_file_free() hook to set the initial security.ima
value.

Signed-off-by: Dmitry Kasatkin <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/integrity/ima/ima_appraise.c | 7 +++++--
security/integrity/ima/ima_main.c | 12 +++++++-----
security/integrity/integrity.h | 1 +
3 files changed, 13 insertions(+), 7 deletions(-)

--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -202,8 +202,11 @@ int ima_appraise_measurement(int func, s
goto out;

cause = "missing-hash";
- status =
- (inode->i_size == 0) ? INTEGRITY_PASS : INTEGRITY_NOLABEL;
+ status = INTEGRITY_NOLABEL;
+ if (inode->i_size == 0) {
+ iint->flags |= IMA_NEW_FILE;
+ status = INTEGRITY_PASS;
+ }
goto out;
}

--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -124,11 +124,13 @@ static void ima_check_last_writer(struct
return;

mutex_lock(&inode->i_mutex);
- if (atomic_read(&inode->i_writecount) == 1 &&
- iint->version != inode->i_version) {
- iint->flags &= ~IMA_DONE_MASK;
- if (iint->flags & IMA_APPRAISE)
- ima_update_xattr(iint, file);
+ if (atomic_read(&inode->i_writecount) == 1) {
+ if ((iint->version != inode->i_version) ||
+ (iint->flags & IMA_NEW_FILE)) {
+ iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
+ if (iint->flags & IMA_APPRAISE)
+ ima_update_xattr(iint, file);
+ }
}
mutex_unlock(&inode->i_mutex);
}
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -31,6 +31,7 @@
#define IMA_DIGSIG 0x01000000
#define IMA_DIGSIG_REQUIRED 0x02000000
#define IMA_PERMIT_DIRECTIO 0x04000000
+#define IMA_NEW_FILE 0x08000000

#define IMA_DO_MASK (IMA_MEASURE | IMA_APPRAISE | IMA_AUDIT | \
IMA_APPRAISE_SUBMASK)

2014-10-28 03:40:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 104/146] Bluetooth: 6lowpan: Set the peer IPv6 address correctly

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jukka Rissanen <[email protected]>

commit b2799cec22812f5f1aaaa57133df51876f685d84 upstream.

The peer IPv6 address contained wrong U/L bit in the EUI-64 part.

Signed-off-by: Jukka Rissanen <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/bluetooth/6lowpan.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -671,6 +671,14 @@ static struct l2cap_chan *chan_open(stru
return chan;
}

+static void set_ip_addr_bits(u8 addr_type, u8 *addr)
+{
+ if (addr_type == BDADDR_LE_PUBLIC)
+ *addr |= 0x02;
+ else
+ *addr &= ~0x02;
+}
+
static struct l2cap_chan *add_peer_chan(struct l2cap_chan *chan,
struct lowpan_dev *dev)
{
@@ -693,6 +701,11 @@ static struct l2cap_chan *add_peer_chan(
memcpy(&peer->eui64_addr, (u8 *)&peer->peer_addr.s6_addr + 8,
EUI64_ADDR_LEN);

+ /* IPv6 address needs to have the U/L bit set properly so toggle
+ * it back here.
+ */
+ set_ip_addr_bits(chan->dst_type, (u8 *)&peer->peer_addr.s6_addr + 8);
+
write_lock_irqsave(&devices_lock, flags);
INIT_LIST_HEAD(&peer->list);
peer_add(dev, peer);

2014-10-28 07:39:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 116/146] powerpc/eeh: Clear frozen device state in time

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gavin Shan <[email protected]>

commit 22fca17924094113fe79c1db5135290e1a84ad4b upstream.

The problem was reported by Carol: In the scenario of passing mlx4
adapter to guest, EEH error could be recovered successfully. When
returning the device back to host, the driver (mlx4_core.ko)
couldn't be loaded successfully because of error number -5 (-EIO)
returned from mlx4_get_ownership(), which hits offlined PCI device.
The root cause is that we missed to put the affected devices into
normal state on clearing PE isolated state right after PE reset.

The patch fixes above issue by putting the affected devices to
normal state when clearing PE isolated state in eeh_pe_state_clear().

Reported-by: Carol L. Soto <[email protected]>
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/eeh_pe.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)

--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -584,6 +584,8 @@ static void *__eeh_pe_state_clear(void *
{
struct eeh_pe *pe = (struct eeh_pe *)data;
int state = *((int *)flag);
+ struct eeh_dev *edev, *tmp;
+ struct pci_dev *pdev;

/* Keep the state of permanently removed PE intact */
if ((pe->freeze_count > EEH_MAX_ALLOWED_FREEZES) &&
@@ -592,9 +594,22 @@ static void *__eeh_pe_state_clear(void *

pe->state &= ~state;

- /* Clear check count since last isolation */
- if (state & EEH_PE_ISOLATED)
- pe->check_count = 0;
+ /*
+ * Special treatment on clearing isolated state. Clear
+ * check count since last isolation and put all affected
+ * devices to normal state.
+ */
+ if (!(state & EEH_PE_ISOLATED))
+ return NULL;
+
+ pe->check_count = 0;
+ eeh_pe_for_each_dev(pe, edev, tmp) {
+ pdev = eeh_dev_to_pci_dev(edev);
+ if (!pdev)
+ continue;
+
+ pdev->error_state = pci_channel_io_normal;
+ }

return NULL;
}

2014-10-28 03:40:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 113/146] powerpc: Fix warning reported by verify_cpu_node_mapping()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Li Zhong <[email protected]>

commit 70ad237515d99595ed03848bd8e549e50e83c4f2 upstream.

With commit 2fabf084b6ad ("powerpc: reorder per-cpu NUMA information's
initialization"), during boottime, cpu_numa_callback() is called
earlier(before their online) for each cpu, and verify_cpu_node_mapping()
uses cpu_to_node() to check whether siblings are in the same node.

It skips the checking for siblings that are not online yet. So the only
check done here is for the bootcpu, which is online at that time. But
the per-cpu numa_node cpu_to_node() uses hasn't been set up yet (which
will be set up in smp_prepare_cpus()).

So I saw something like following reported:
[ 0.000000] CPU thread siblings 1/2/3 and 0 don't belong to the same
node!

As we don't actually do the checking during this early stage, so maybe
we could directly call numa_setup_cpu() in do_init_bootmem().

Cc: Nishanth Aravamudan <[email protected]>
Cc: Nathan Fontenot <[email protected]>
Signed-off-by: Li Zhong <[email protected]>
Acked-by: Nishanth Aravamudan <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/mm/numa.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1128,8 +1128,7 @@ void __init do_init_bootmem(void)
* early in boot, cf. smp_prepare_cpus().
*/
for_each_possible_cpu(cpu) {
- cpu_numa_callback(&ppc64_numa_nb, CPU_UP_PREPARE,
- (void *)(unsigned long)cpu);
+ numa_setup_cpu((unsigned long)cpu);
}
}


2014-10-28 07:39:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 115/146] powerpc/iommu/ddw: Fix endianness

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

commit 9410e0185e65394c0c6d046033904b53b97a9423 upstream.

rtas_call() accepts and returns values in CPU endianness.
The ddw_query_response and ddw_create_response structs members are
defined and treated as BE but as they are passed to rtas_call() as
(u32 *) and they get byteswapped automatically, the data is CPU-endian.
This fixes ddw_query_response and ddw_create_response definitions and use.

of_read_number() is designed to work with device tree cells - it assumes
the input is big-endian and returns data in CPU-endian. However due
to the ddw_create_response struct fix, create.addr_hi/lo are already
CPU-endian so do not byteswap them.

ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains
3 cells which are big-endian as it is a device tree. rtas_call() accepts
a RTAS token in CPU-endian. This makes use of of_property_read_u32_array
to byte swap and avoid the need for a number of be32_to_cpu calls.

Cc: Benjamin Herrenschmidt <[email protected]>
[aik: folded Anton's patch with of_property_read_u32_array]
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Acked-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/platforms/pseries/iommu.c | 51 ++++++++++++++++++---------------
1 file changed, 28 insertions(+), 23 deletions(-)

--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -329,16 +329,16 @@ struct direct_window {

/* Dynamic DMA Window support */
struct ddw_query_response {
- __be32 windows_available;
- __be32 largest_available_block;
- __be32 page_size;
- __be32 migration_capable;
+ u32 windows_available;
+ u32 largest_available_block;
+ u32 page_size;
+ u32 migration_capable;
};

struct ddw_create_response {
- __be32 liobn;
- __be32 addr_hi;
- __be32 addr_lo;
+ u32 liobn;
+ u32 addr_hi;
+ u32 addr_lo;
};

static LIST_HEAD(direct_window_list);
@@ -725,16 +725,18 @@ static void remove_ddw(struct device_nod
{
struct dynamic_dma_window_prop *dwp;
struct property *win64;
- const u32 *ddw_avail;
+ u32 ddw_avail[3];
u64 liobn;
- int len, ret = 0;
+ int ret = 0;
+
+ ret = of_property_read_u32_array(np, "ibm,ddw-applicable",
+ &ddw_avail[0], 3);

- ddw_avail = of_get_property(np, "ibm,ddw-applicable", &len);
win64 = of_find_property(np, DIRECT64_PROPNAME, NULL);
if (!win64)
return;

- if (!ddw_avail || len < 3 * sizeof(u32) || win64->length < sizeof(*dwp))
+ if (ret || win64->length < sizeof(*dwp))
goto delprop;

dwp = win64->value;
@@ -872,8 +874,9 @@ static int create_ddw(struct pci_dev *de

do {
/* extra outputs are LIOBN and dma-addr (hi, lo) */
- ret = rtas_call(ddw_avail[1], 5, 4, (u32 *)create, cfg_addr,
- BUID_HI(buid), BUID_LO(buid), page_shift, window_shift);
+ ret = rtas_call(ddw_avail[1], 5, 4, (u32 *)create,
+ cfg_addr, BUID_HI(buid), BUID_LO(buid),
+ page_shift, window_shift);
} while (rtas_busy_delay(ret));
dev_info(&dev->dev,
"ibm,create-pe-dma-window(%x) %x %x %x %x %x returned %d "
@@ -910,7 +913,7 @@ static u64 enable_ddw(struct pci_dev *de
int page_shift;
u64 dma_addr, max_addr;
struct device_node *dn;
- const u32 *uninitialized_var(ddw_avail);
+ u32 ddw_avail[3];
struct direct_window *window;
struct property *win64;
struct dynamic_dma_window_prop *ddwprop;
@@ -942,8 +945,9 @@ static u64 enable_ddw(struct pci_dev *de
* for the given node in that order.
* the property is actually in the parent, not the PE
*/
- ddw_avail = of_get_property(pdn, "ibm,ddw-applicable", &len);
- if (!ddw_avail || len < 3 * sizeof(u32))
+ ret = of_property_read_u32_array(pdn, "ibm,ddw-applicable",
+ &ddw_avail[0], 3);
+ if (ret)
goto out_failed;

/*
@@ -966,11 +970,11 @@ static u64 enable_ddw(struct pci_dev *de
dev_dbg(&dev->dev, "no free dynamic windows");
goto out_failed;
}
- if (be32_to_cpu(query.page_size) & 4) {
+ if (query.page_size & 4) {
page_shift = 24; /* 16MB */
- } else if (be32_to_cpu(query.page_size) & 2) {
+ } else if (query.page_size & 2) {
page_shift = 16; /* 64kB */
- } else if (be32_to_cpu(query.page_size) & 1) {
+ } else if (query.page_size & 1) {
page_shift = 12; /* 4kB */
} else {
dev_dbg(&dev->dev, "no supported direct page size in mask %x",
@@ -980,7 +984,7 @@ static u64 enable_ddw(struct pci_dev *de
/* verify the window * number of ptes will map the partition */
/* check largest block * page size > max memory hotplug addr */
max_addr = memory_hotplug_max();
- if (be32_to_cpu(query.largest_available_block) < (max_addr >> page_shift)) {
+ if (query.largest_available_block < (max_addr >> page_shift)) {
dev_dbg(&dev->dev, "can't map partiton max 0x%llx with %u "
"%llu-sized pages\n", max_addr, query.largest_available_block,
1ULL << page_shift);
@@ -1006,8 +1010,9 @@ static u64 enable_ddw(struct pci_dev *de
if (ret != 0)
goto out_free_prop;

- ddwprop->liobn = create.liobn;
- ddwprop->dma_base = cpu_to_be64(of_read_number(&create.addr_hi, 2));
+ ddwprop->liobn = cpu_to_be32(create.liobn);
+ ddwprop->dma_base = cpu_to_be64(((u64)create.addr_hi << 32) |
+ create.addr_lo);
ddwprop->tce_shift = cpu_to_be32(page_shift);
ddwprop->window_shift = cpu_to_be32(len);

@@ -1039,7 +1044,7 @@ static u64 enable_ddw(struct pci_dev *de
list_add(&window->list, &direct_window_list);
spin_unlock(&direct_window_list_lock);

- dma_addr = of_read_number(&create.addr_hi, 2);
+ dma_addr = be64_to_cpu(ddwprop->dma_base);
goto out_unlock;

out_free_window:

2014-10-28 03:40:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 092/146] xfs: fix agno increment in xfs_inumbers() loop

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Sandeen <[email protected]>

commit a8b1ee8bafc765ebf029d03c5479a69aebff9693 upstream.

caused a regression in xfs_inumbers, which in turn broke
xfsdump, causing incomplete dumps.

The loop in xfs_inumbers() needs to fill the user-supplied
buffers, and iterates via xfs_btree_increment, reading new
ags as needed.

But the first time through the loop, if xfs_btree_increment()
succeeds, we continue, which triggers the ++agno at the bottom
of the loop, and we skip to soon to the next ag - without
the proper setup under next_ag to read the next ag.

Fix this by removing the agno increment from the loop conditional,
and only increment agno if we have actually hit the code under
the next_ag: target.

Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/xfs/xfs_itable.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -639,7 +639,8 @@ next_ag:
xfs_buf_relse(agbp);
agbp = NULL;
agino = 0;
- } while (++agno < mp->m_sb.sb_agcount);
+ agno++;
+ } while (agno < mp->m_sb.sb_agcount);

if (!error) {
if (bufidx) {

2014-10-28 03:40:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 090/146] vfio-pci: Fix remove path locking

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Williamson <[email protected]>

commit 93899a679fd6b2534b5c297d9316bae039ebcbe1 upstream.

Locking both the remove() and release() path results in a deadlock
that should have been obvious. To fix this we can get and hold the
vfio_device reference as we evaluate whether to do a bus/slot reset.
This will automatically block any remove() calls, allowing us to
remove the explict lock. Fixes 61d792562b53.

Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/vfio/pci/vfio_pci.c | 138 ++++++++++++++++++--------------------------
1 file changed, 58 insertions(+), 80 deletions(-)

--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -876,15 +876,11 @@ static void vfio_pci_remove(struct pci_d
{
struct vfio_pci_device *vdev;

- mutex_lock(&driver_lock);
-
vdev = vfio_del_group_dev(&pdev->dev);
if (vdev) {
iommu_group_put(pdev->dev.iommu_group);
kfree(vdev);
}
-
- mutex_unlock(&driver_lock);
}

static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
@@ -927,108 +923,90 @@ static struct pci_driver vfio_pci_driver
.err_handler = &vfio_err_handlers,
};

-/*
- * Test whether a reset is necessary and possible. We mark devices as
- * needs_reset when they are released, but don't have a function-local reset
- * available. If any of these exist in the affected devices, we want to do
- * a bus/slot reset. We also need all of the affected devices to be unused,
- * so we abort if any device has a non-zero refcnt. driver_lock prevents a
- * device from being opened during the scan or unbound from vfio-pci.
- */
-static int vfio_pci_test_bus_reset(struct pci_dev *pdev, void *data)
-{
- bool *needs_reset = data;
- struct pci_driver *pci_drv = ACCESS_ONCE(pdev->driver);
- int ret = -EBUSY;
-
- if (pci_drv == &vfio_pci_driver) {
- struct vfio_device *device;
- struct vfio_pci_device *vdev;
-
- device = vfio_device_get_from_dev(&pdev->dev);
- if (!device)
- return ret;
-
- vdev = vfio_device_data(device);
- if (vdev) {
- if (vdev->needs_reset)
- *needs_reset = true;
-
- if (!vdev->refcnt)
- ret = 0;
- }
-
- vfio_device_put(device);
- }
-
- /*
- * TODO: vfio-core considers groups to be viable even if some devices
- * are attached to known drivers, like pci-stub or pcieport. We can't
- * freeze devices from being unbound to those drivers like we can
- * here though, so it would be racy to test for them. We also can't
- * use device_lock() to prevent changes as that would interfere with
- * PCI-core taking device_lock during bus reset. For now, we require
- * devices to be bound to vfio-pci to get a bus/slot reset on release.
- */
-
- return ret;
-}
+struct vfio_devices {
+ struct vfio_device **devices;
+ int cur_index;
+ int max_index;
+};

-/* Clear needs_reset on all affected devices after successful bus/slot reset */
-static int vfio_pci_clear_needs_reset(struct pci_dev *pdev, void *data)
+static int vfio_pci_get_devs(struct pci_dev *pdev, void *data)
{
+ struct vfio_devices *devs = data;
struct pci_driver *pci_drv = ACCESS_ONCE(pdev->driver);

- if (pci_drv == &vfio_pci_driver) {
- struct vfio_device *device;
- struct vfio_pci_device *vdev;
-
- device = vfio_device_get_from_dev(&pdev->dev);
- if (!device)
- return 0;
-
- vdev = vfio_device_data(device);
- if (vdev)
- vdev->needs_reset = false;
+ if (pci_drv != &vfio_pci_driver)
+ return -EBUSY;

- vfio_device_put(device);
- }
+ if (devs->cur_index == devs->max_index)
+ return -ENOSPC;

+ devs->devices[devs->cur_index] = vfio_device_get_from_dev(&pdev->dev);
+ if (!devs->devices[devs->cur_index])
+ return -EINVAL;
+
+ devs->cur_index++;
return 0;
}

/*
* Attempt to do a bus/slot reset if there are devices affected by a reset for
* this device that are needs_reset and all of the affected devices are unused
- * (!refcnt). Callers of this function are required to hold driver_lock such
- * that devices can not be unbound from vfio-pci or opened by a user while we
- * test for and perform a bus/slot reset.
+ * (!refcnt). Callers are required to hold driver_lock when calling this to
+ * prevent device opens and concurrent bus reset attempts. We prevent device
+ * unbinds by acquiring and holding a reference to the vfio_device.
+ *
+ * NB: vfio-core considers a group to be viable even if some devices are
+ * bound to drivers like pci-stub or pcieport. Here we require all devices
+ * to be bound to vfio_pci since that's the only way we can be sure they
+ * stay put.
*/
static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev)
{
+ struct vfio_devices devs = { .cur_index = 0 };
+ int i = 0, ret = -EINVAL;
bool needs_reset = false, slot = false;
- int ret;
+ struct vfio_pci_device *tmp;

if (!pci_probe_reset_slot(vdev->pdev->slot))
slot = true;
else if (pci_probe_reset_bus(vdev->pdev->bus))
return;

- if (vfio_pci_for_each_slot_or_bus(vdev->pdev,
- vfio_pci_test_bus_reset,
- &needs_reset, slot) || !needs_reset)
+ if (vfio_pci_for_each_slot_or_bus(vdev->pdev, vfio_pci_count_devs,
+ &i, slot) || !i)
return;

- if (slot)
- ret = pci_try_reset_slot(vdev->pdev->slot);
- else
- ret = pci_try_reset_bus(vdev->pdev->bus);
-
- if (ret)
+ devs.max_index = i;
+ devs.devices = kcalloc(i, sizeof(struct vfio_device *), GFP_KERNEL);
+ if (!devs.devices)
return;

- vfio_pci_for_each_slot_or_bus(vdev->pdev,
- vfio_pci_clear_needs_reset, NULL, slot);
+ if (vfio_pci_for_each_slot_or_bus(vdev->pdev,
+ vfio_pci_get_devs, &devs, slot))
+ goto put_devs;
+
+ for (i = 0; i < devs.cur_index; i++) {
+ tmp = vfio_device_data(devs.devices[i]);
+ if (tmp->needs_reset)
+ needs_reset = true;
+ if (tmp->refcnt)
+ goto put_devs;
+ }
+
+ if (needs_reset)
+ ret = slot ? pci_try_reset_slot(vdev->pdev->slot) :
+ pci_try_reset_bus(vdev->pdev->bus);
+
+put_devs:
+ for (i = 0; i < devs.cur_index; i++) {
+ if (!ret) {
+ tmp = vfio_device_data(devs.devices[i]);
+ tmp->needs_reset = false;
+ }
+ vfio_device_put(devs.devices[i]);
+ }
+
+ kfree(devs.devices);
}

static void __exit vfio_pci_cleanup(void)

2014-10-28 07:41:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 091/146] xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Chinner <[email protected]>

commit 0d085a529b427d97710e6a41f8a4f23e1757cd12 upstream.

XFS has been having trouble with stray delayed allocation extents
beyond EOF for a long time. Recent changes to the collapse range
code has triggered erroneous EBUSY errors on page invalidtion for
block size smaller than page size filesystems. These
have been caused by dirty buffers beyond EOF on a partial page which
do not get written to disk during a sync.

The issue is that write-ahead in xfs_cluster_write() finds such a
partial page and handles it by leaving the page dirty but pushing it
into a writeback state. This used to work just fine, as the
write_cache_pages() code would then find the dirty partial page in
the next mapping tree lookup as the dirty tag is still set.

Unfortunately, when we moved to a mark and sweep approach to
writeback to fix other writeback sync issues, we broken this. THe
act of marking the page as under writeback now clears the TOWRITE
tag in the radix tree, even though the page is still dirty. This
causes the TOWRITE tag to be cleared, and hence the next lookup on
the mapping tree does not find the dirty partial page and so doesn't
try to write it again.

This same writeback bug was found recently in ext4 and fixed in
commit 1c8349a ("ext4: fix data integrity sync in ordered mode")
without communication to the wider filesystem community. We can use
exactly the same fix here so the TOWRITE flag is not cleared on
partial page writes.

cc: [email protected] # dependent on 1c8349a17137b93f0a83f276c764a6df1b9a116e
Root-cause-found-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Brian Foster <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/xfs/xfs_aops.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -434,10 +434,22 @@ xfs_start_page_writeback(
{
ASSERT(PageLocked(page));
ASSERT(!PageWriteback(page));
- if (clear_dirty)
+
+ /*
+ * if the page was not fully cleaned, we need to ensure that the higher
+ * layers come back to it correctly. That means we need to keep the page
+ * dirty, and for WB_SYNC_ALL writeback we need to ensure the
+ * PAGECACHE_TAG_TOWRITE index mark is not removed so another attempt to
+ * write this page in this writeback sweep will be made.
+ */
+ if (clear_dirty) {
clear_page_dirty_for_io(page);
- set_page_writeback(page);
+ set_page_writeback(page);
+ } else
+ set_page_writeback_keepwrite(page);
+
unlock_page(page);
+
/* If no buffers on the page are to be written, finish it here */
if (!buffers)
end_page_writeback(page);

2014-10-28 07:41:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 089/146] udf: Fix loading of special inodes

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jan Kara <[email protected]>

commit 6174c2eb8ecef271159bdcde460ce8af54d8f72f upstream.

Some UDF media have special inodes (like VAT or metadata partition
inodes) whose link_count is 0. Thus commit 4071b9136223 (udf: Properly
detect stale inodes) broke loading these inodes because udf_iget()
started returning -ESTALE for them. Since we still need to properly
detect stale inodes queried by NFS, create two variants of udf_iget() -
one which is used for looking up special inodes (which ignores
link_count == 0) and one which is used for other cases which return
ESTALE when link_count == 0.

Fixes: 4071b913622316970d0e1919f7d82b4403fec5f2
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/udf/inode.c | 14 +++++++++-----
fs/udf/super.c | 10 +++++-----
fs/udf/udfdecl.h | 13 ++++++++++++-
3 files changed, 26 insertions(+), 11 deletions(-)

--- a/fs/udf/inode.c
+++ b/fs/udf/inode.c
@@ -1277,7 +1277,7 @@ update_time:
*/
#define UDF_MAX_ICB_NESTING 1024

-static int udf_read_inode(struct inode *inode)
+static int udf_read_inode(struct inode *inode, bool hidden_inode)
{
struct buffer_head *bh = NULL;
struct fileEntry *fe;
@@ -1436,8 +1436,11 @@ reread:

link_count = le16_to_cpu(fe->fileLinkCount);
if (!link_count) {
- ret = -ESTALE;
- goto out;
+ if (!hidden_inode) {
+ ret = -ESTALE;
+ goto out;
+ }
+ link_count = 1;
}
set_nlink(inode, link_count);

@@ -1826,7 +1829,8 @@ out:
return err;
}

-struct inode *udf_iget(struct super_block *sb, struct kernel_lb_addr *ino)
+struct inode *__udf_iget(struct super_block *sb, struct kernel_lb_addr *ino,
+ bool hidden_inode)
{
unsigned long block = udf_get_lb_pblock(sb, ino, 0);
struct inode *inode = iget_locked(sb, block);
@@ -1839,7 +1843,7 @@ struct inode *udf_iget(struct super_bloc
return inode;

memcpy(&UDF_I(inode)->i_location, ino, sizeof(struct kernel_lb_addr));
- err = udf_read_inode(inode);
+ err = udf_read_inode(inode, hidden_inode);
if (err < 0) {
iget_failed(inode);
return ERR_PTR(err);
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -959,7 +959,7 @@ struct inode *udf_find_metadata_inode_ef
addr.logicalBlockNum = meta_file_loc;
addr.partitionReferenceNum = partition_num;

- metadata_fe = udf_iget(sb, &addr);
+ metadata_fe = udf_iget_special(sb, &addr);

if (IS_ERR(metadata_fe)) {
udf_warn(sb, "metadata inode efe not found\n");
@@ -1020,7 +1020,7 @@ static int udf_load_metadata_files(struc
udf_debug("Bitmap file location: block = %d part = %d\n",
addr.logicalBlockNum, addr.partitionReferenceNum);

- fe = udf_iget(sb, &addr);
+ fe = udf_iget_special(sb, &addr);
if (IS_ERR(fe)) {
if (sb->s_flags & MS_RDONLY)
udf_warn(sb, "bitmap inode efe not found but it's ok since the disc is mounted read-only\n");
@@ -1119,7 +1119,7 @@ static int udf_fill_partdesc_info(struct
};
struct inode *inode;

- inode = udf_iget(sb, &loc);
+ inode = udf_iget_special(sb, &loc);
if (IS_ERR(inode)) {
udf_debug("cannot load unallocSpaceTable (part %d)\n",
p_index);
@@ -1154,7 +1154,7 @@ static int udf_fill_partdesc_info(struct
};
struct inode *inode;

- inode = udf_iget(sb, &loc);
+ inode = udf_iget_special(sb, &loc);
if (IS_ERR(inode)) {
udf_debug("cannot load freedSpaceTable (part %d)\n",
p_index);
@@ -1198,7 +1198,7 @@ static void udf_find_vat_block(struct su
vat_block >= map->s_partition_root &&
vat_block >= start_block - 3; vat_block--) {
ino.logicalBlockNum = vat_block - map->s_partition_root;
- inode = udf_iget(sb, &ino);
+ inode = udf_iget_special(sb, &ino);
if (!IS_ERR(inode)) {
sbi->s_vat_inode = inode;
break;
--- a/fs/udf/udfdecl.h
+++ b/fs/udf/udfdecl.h
@@ -138,7 +138,18 @@ extern int udf_write_fi(struct inode *in
/* file.c */
extern long udf_ioctl(struct file *, unsigned int, unsigned long);
/* inode.c */
-extern struct inode *udf_iget(struct super_block *, struct kernel_lb_addr *);
+extern struct inode *__udf_iget(struct super_block *, struct kernel_lb_addr *,
+ bool hidden_inode);
+static inline struct inode *udf_iget_special(struct super_block *sb,
+ struct kernel_lb_addr *ino)
+{
+ return __udf_iget(sb, ino, true);
+}
+static inline struct inode *udf_iget(struct super_block *sb,
+ struct kernel_lb_addr *ino)
+{
+ return __udf_iget(sb, ino, false);
+}
extern int udf_expand_file_adinicb(struct inode *);
extern struct buffer_head *udf_expand_dir_adinicb(struct inode *, int *, int *);
extern struct buffer_head *udf_bread(struct inode *, int, int, int *);

2014-10-28 03:40:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 087/146] ARM: dts: imx28-evk: Let i2c0 run at 100kHz

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fabio Estevam <[email protected]>

commit d1e61eb443dc7512885dfe89ee2f2a1c29fcb1da upstream.

Commit 78b81f4666fb ("ARM: dts: imx28-evk: Run I2C0 at 400kHz") caused issues
when doing the following sequence in loop:

- Boot the kernel
- Perform audio playback
- Reboot the system via 'reboot' command

In many times the audio card cannot be probed, which causes playback to fail.

After restoring to the original i2c0 frequency of 100kHz there is no such
problem anymore.

This reverts commit 78b81f4666fbb22a20b1e63e5baf197ad2e90e88.

Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/imx28-evk.dts | 1 -
1 file changed, 1 deletion(-)

--- a/arch/arm/boot/dts/imx28-evk.dts
+++ b/arch/arm/boot/dts/imx28-evk.dts
@@ -193,7 +193,6 @@
i2c0: i2c@80058000 {
pinctrl-names = "default";
pinctrl-0 = <&i2c0_pins_a>;
- clock-frequency = <400000>;
status = "okay";

sgtl5000: codec@0a {

2014-10-28 07:42:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 086/146] ARM: mvebu: Netgear RN102: Use Hardware BCH ECC

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "[email protected]" <[email protected]>

commit ace8578182dc347b043c0825b9873f62fdaa5b77 upstream.

The bootloader on the Netgear ReadyNAS RN102 uses Hardware BCH ECC
(strength = 4), while the pxa3xx NAND driver by default uses
Hamming ECC (strength = 1).

This patch changes the ECC mode on these machines to match that
of the bootloader and of the stock firmware. That way, it is
now possible to update the kernel from userland (e.g. using
standard tools from mtd-utils package); u-boot will happily
load and boot it.

Fixes: 92beaccd8b49 ("ARM: mvebu: Enable NAND controller in ReadyNAS 102 .dts file")
Signed-off-by: Ben Peddell <[email protected]>
Acked-by: Ezequiel Garcia <[email protected]>
Tested-by: Arnaud Ebalard <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jason Cooper <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/armada-370-netgear-rn102.dts | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm/boot/dts/armada-370-netgear-rn102.dts
+++ b/arch/arm/boot/dts/armada-370-netgear-rn102.dts
@@ -143,6 +143,10 @@
marvell,nand-enable-arbiter;
nand-on-flash-bbt;

+ /* Use Hardware BCH ECC */
+ nand-ecc-strength = <4>;
+ nand-ecc-step-size = <512>;
+
partition@0 {
label = "u-boot";
reg = <0x0000000 0x180000>; /* 1.5MB */

2014-10-28 07:43:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 101/146] Bluetooth: Fix issue with USB suspend in btusb driver

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Champion Chen <[email protected]>

commit 85560c4a828ec9c8573840c9b66487b6ae584768 upstream.

Suspend could fail for some platforms because
btusb_suspend==> btusb_stop_traffic ==> usb_kill_anchored_urbs.

When btusb_bulk_complete returns before system suspend and resubmits
an URB, the system cannot enter suspend state.

Signed-off-by: Champion Chen <[email protected]>
Signed-off-by: Larry Finger <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/btusb.c | 9 +++++++++
1 file changed, 9 insertions(+)

--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -330,6 +330,9 @@ static void btusb_intr_complete(struct u
BT_ERR("%s corrupted event packet", hdev->name);
hdev->stat.err_rx++;
}
+ } else if (urb->status == -ENOENT) {
+ /* Avoid suspend failed when usb_kill_urb */
+ return;
}

if (!test_bit(BTUSB_INTR_RUNNING, &data->flags))
@@ -418,6 +421,9 @@ static void btusb_bulk_complete(struct u
BT_ERR("%s corrupted ACL packet", hdev->name);
hdev->stat.err_rx++;
}
+ } else if (urb->status == -ENOENT) {
+ /* Avoid suspend failed when usb_kill_urb */
+ return;
}

if (!test_bit(BTUSB_BULK_RUNNING, &data->flags))
@@ -512,6 +518,9 @@ static void btusb_isoc_complete(struct u
hdev->stat.err_rx++;
}
}
+ } else if (urb->status == -ENOENT) {
+ /* Avoid suspend failed when usb_kill_urb */
+ return;
}

if (!test_bit(BTUSB_ISOC_RUNNING, &data->flags))

2014-10-28 07:43:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 099/146] Bluetooth: Fix HCI H5 corrupted ack value

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Loic Poulain <[email protected]>

commit 4807b51895dce8aa650ebebc51fa4a795ed6b8b8 upstream.

In this expression: seq = (seq - 1) % 8
seq (u8) is implicitly converted to an int in the arithmetic operation.
So if seq value is 0, operation is ((0 - 1) % 8) => (-1 % 8) => -1.
The new seq value is 0xff which is an invalid ACK value, we expect 0x07.
It leads to frequent dropped ACK and retransmission.
Fix this by using '&' binary operator instead of '%'.

Signed-off-by: Loic Poulain <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/hci_h5.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/bluetooth/hci_h5.c
+++ b/drivers/bluetooth/hci_h5.c
@@ -237,7 +237,7 @@ static void h5_pkt_cull(struct h5 *h5)
break;

to_remove--;
- seq = (seq - 1) % 8;
+ seq = (seq - 1) & 0x07;
}

if (seq != h5->rx_ack)

2014-10-28 07:43:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 097/146] rt2800: correct BBP1_TX_POWER_CTRL mask

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Stanislaw Gruszka <[email protected]>

commit 01f7feeaf4528bec83798316b3c811701bac5d3e upstream.

Two bits control TX power on BBP_R1 register. Correct the mask,
otherwise we clear additional bit on BBP_R1 register, what can have
unknown, possible negative effect.

Signed-off-by: Stanislaw Gruszka <[email protected]>
Signed-off-by: John W. Linville <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/rt2x00/rt2800.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/wireless/rt2x00/rt2800.h
+++ b/drivers/net/wireless/rt2x00/rt2800.h
@@ -2039,7 +2039,7 @@ struct mac_iveiv_entry {
* 2 - drop tx power by 12dBm,
* 3 - increase tx power by 6dBm
*/
-#define BBP1_TX_POWER_CTRL FIELD8(0x07)
+#define BBP1_TX_POWER_CTRL FIELD8(0x03)
#define BBP1_TX_ANTENNA FIELD8(0x18)

/*

2014-10-28 03:39:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 095/146] PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Douglas Lehr <[email protected]>

commit 9fe373f9997b48fcd6222b95baf4a20c134b587a upstream.

The Crocodile chip occasionally comes up with 4k and 8k BAR sizes. Due to
an erratum, setting the SR-IOV page size causes the physical function BARs
to expand to the system page size. Since ppc64 uses 64k pages, when Linux
tries to assign the smaller resource sizes to the now 64k BARs the address
will be truncated and the BARs will overlap.

Force Linux to allocate the resource as a full page, which avoids the
overlap.

[bhelgaas: print expanded resource, too]
Signed-off-by: Douglas Lehr <[email protected]>
Signed-off-by: Anton Blanchard <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Milton Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pci/quirks.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -24,6 +24,7 @@
#include <linux/ioport.h>
#include <linux/sched.h>
#include <linux/ktime.h>
+#include <linux/mm.h>
#include <asm/dma.h> /* isa_dma_bridge_buggy */
#include "pci.h"

@@ -287,6 +288,25 @@ static void quirk_citrine(struct pci_dev
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, PCI_DEVICE_ID_IBM_CITRINE, quirk_citrine);

+/* On IBM Crocodile ipr SAS adapters, expand BAR to system page size */
+static void quirk_extend_bar_to_page(struct pci_dev *dev)
+{
+ int i;
+
+ for (i = 0; i < PCI_STD_RESOURCE_END; i++) {
+ struct resource *r = &dev->resource[i];
+
+ if (r->flags & IORESOURCE_MEM && resource_size(r) < PAGE_SIZE) {
+ r->end = PAGE_SIZE - 1;
+ r->start = 0;
+ r->flags |= IORESOURCE_UNSET;
+ dev_info(&dev->dev, "expanded BAR %d to page size: %pR\n",
+ i, r);
+ }
+ }
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_IBM, 0x034a, quirk_extend_bar_to_page);
+
/*
* S3 868 and 968 chips report region size equal to 32M, but they decode 64M.
* If it's needed, re-allocate the region.

2014-10-28 07:44:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 096/146] PCI: Generate uppercase hex for modalias interface class

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ricardo Ribalda Delgado <[email protected]>

commit 89ec3dcf17fd3fa009ecf8faaba36828dd6bc416 upstream.

Some implementations of modprobe fail to load the driver for a PCI device
automatically because the "interface" part of the modalias from the kernel
is lowercase, and the modalias from file2alias is uppercase.

The "interface" is the low-order byte of the Class Code, defined in PCI
r3.0, Appendix D. Most interface types defined in the spec do not use
alpha characters, so they won't be affected. For example, 00h, 01h, 10h,
20h, etc. are unaffected.

Print the "interface" byte of the Class Code in uppercase hex, as we
already do for the Vendor ID, Device ID, Class, etc.

[bhelgaas: changelog]
Signed-off-by: Ricardo Ribalda Delgado <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pci/pci-sysfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -177,7 +177,7 @@ static ssize_t modalias_show(struct devi
{
struct pci_dev *pci_dev = to_pci_dev(dev);

- return sprintf(buf, "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02x\n",
+ return sprintf(buf, "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X\n",
pci_dev->vendor, pci_dev->device,
pci_dev->subsystem_vendor, pci_dev->subsystem_device,
(u8)(pci_dev->class >> 16), (u8)(pci_dev->class >> 8),

2014-10-28 07:45:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 084/146] ARM: mvebu: Netgear RN104: Use Hardware BCH ECC

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnaud Ebalard <[email protected]>

commit 225b94cdf719d0bc522a354bdafc18e5da5ff83b upstream.

The bootloader on the Netgear ReadyNAS RN104 uses Hardware BCH
ECC (strength = 4), while the pxa3xx NAND driver by default uses
Hamming ECC (strength = 1).

This patch changes the ECC mode on these machines to match that
of the bootloader and of the stock firmware. That way, it is
now possible to update the kernel from userland (e.g. using
standard tools from mtd-utils package); u-boot will happily
load and boot it.

The issue was initially reported and fixed by Ben Pedell for
RN102. The RN104 shares the same Hynix H27U1G8F2BTR NAND
flash and setup. This patch is based on Ben's fix for RN102.

Fixes: 0373a558bd79 ("ARM: mvebu: Enable NAND controller in ReadyNAS 104 .dts file")
Signed-off-by: Arnaud Ebalard <[email protected]>
Link: https://lkml.kernel.org/r/920c7e7169dc6aaaa3eb4bced2336d38e77b8864.1410035142.git.arno@natisbad.org
Signed-off-by: Jason Cooper <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/armada-370-netgear-rn104.dts | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm/boot/dts/armada-370-netgear-rn104.dts
+++ b/arch/arm/boot/dts/armada-370-netgear-rn104.dts
@@ -145,6 +145,10 @@
marvell,nand-enable-arbiter;
nand-on-flash-bbt;

+ /* Use Hardware BCH ECC */
+ nand-ecc-strength = <4>;
+ nand-ecc-step-size = <512>;
+
partition@0 {
label = "u-boot";
reg = <0x0000000 0x180000>; /* 1.5MB */

2014-10-28 07:45:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 082/146] ARM: at91/PMC: dont forget to write PMC_PCDR register to disable clocks

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ludovic Desroches <[email protected]>

commit cfa1950e6c6b72251e80adc736af3c3d2907ab0e upstream.

When introducing support for sama5d3, the write to PMC_PCDR register has
been accidentally removed.

Reported-by: Nathalie Cyrille <[email protected]>
Signed-off-by: Ludovic Desroches <[email protected]>
Signed-off-by: Nicolas Ferre <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-at91/clock.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/arm/mach-at91/clock.c
+++ b/arch/arm/mach-at91/clock.c
@@ -962,6 +962,7 @@ static int __init at91_clock_reset(void)
}

at91_pmc_write(AT91_PMC_SCDR, scdr);
+ at91_pmc_write(AT91_PMC_PCDR, pcdr);
if (cpu_is_sama5d3())
at91_pmc_write(AT91_PMC_PCDR1, pcdr1);


2014-10-28 07:46:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 080/146] ARM: at91/dt: Fix typo regarding can0_clk

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Dueck <[email protected]>

commit 0a51d644c20f5c88fd3a659119d1903f74927082 upstream.

Otherwise the clock for can0 will never get enabled.

Signed-off-by: David Dueck <[email protected]>
Signed-off-by: Anthony Harivel <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Signed-off-by: Nicolas Ferre <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/boot/dts/sama5d3_can.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/boot/dts/sama5d3_can.dtsi
+++ b/arch/arm/boot/dts/sama5d3_can.dtsi
@@ -40,7 +40,7 @@
atmel,clk-output-range = <0 66000000>;
};

- can1_clk: can0_clk {
+ can1_clk: can1_clk {
#clock-cells = <0>;
reg = <41>;
atmel,clk-output-range = <0 66000000>;

2014-10-28 03:39:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 076/146] ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Anssi Hannula <[email protected]>

commit 6acce400d9daf1353fbf497302670c90a3205e1d upstream.

The ELD ALSA control change event is sent by hdmi_present_sense() when
eld_changed is true.

Currently, it is only true when the ELD buffer contents have been
modified. However, the user-visible ELD controls also change to a
zero-length value and back when eld_valid is unset/set, and no event is
currently sent in such cases (such as when unplugging or replugging a
sink).

Fix the code to always set eld_changed if eld_valid value is changed,
and therefore to always send the change event when the user-visible
value changes.

Signed-off-by: Anssi Hannula <[email protected]>
Cc: David Henningsson <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/hda/patch_hdmi.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -1577,19 +1577,22 @@ static bool hdmi_present_sense(struct hd
}
}

- if (pin_eld->eld_valid && !eld->eld_valid) {
- update_eld = true;
+ if (pin_eld->eld_valid != eld->eld_valid)
eld_changed = true;
- }
+
+ if (pin_eld->eld_valid && !eld->eld_valid)
+ update_eld = true;
+
if (update_eld) {
bool old_eld_valid = pin_eld->eld_valid;
pin_eld->eld_valid = eld->eld_valid;
- eld_changed = pin_eld->eld_size != eld->eld_size ||
+ if (pin_eld->eld_size != eld->eld_size ||
memcmp(pin_eld->eld_buffer, eld->eld_buffer,
- eld->eld_size) != 0;
- if (eld_changed)
+ eld->eld_size) != 0) {
memcpy(pin_eld->eld_buffer, eld->eld_buffer,
eld->eld_size);
+ eld_changed = true;
+ }
pin_eld->eld_size = eld->eld_size;
pin_eld->info = eld->info;


2014-10-28 03:39:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 075/146] ALSA: usb-audio: Add support for Steinberg UR22 USB interface

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vlad Catoi <[email protected]>

commit f0b127fbfdc8756eba7437ab668f3169280bd358 upstream.

Adding support for Steinberg UR22 USB interface via quirks table patch

See Ubuntu bug report:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317244
Also see threads:
http://linux-audio.4202.n7.nabble.com/Support-for-Steinberg-UR22-Yamaha-USB-chipset-0499-1509-tc82888.html#a82917
http://www.steinberg.net/forums/viewtopic.php?t=62290

Tested by at least 4 people judging by the threads.
Did not test MIDI interface, but audio output and capture both are
functional. Built 3.17 kernel with this driver on Ubuntu 14.04 & tested with mpg123
Patch applied to 3.13 Ubuntu kernel works well enough for daily use.

Signed-off-by: Vlad Catoi <[email protected]>
Acked-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/usb/quirks-table.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

--- a/sound/usb/quirks-table.h
+++ b/sound/usb/quirks-table.h
@@ -385,6 +385,36 @@ YAMAHA_DEVICE(0x105d, NULL),
}
},
{
+ USB_DEVICE(0x0499, 0x1509),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ /* .vendor_name = "Yamaha", */
+ /* .product_name = "Steinberg UR22", */
+ .ifnum = QUIRK_ANY_INTERFACE,
+ .type = QUIRK_COMPOSITE,
+ .data = (const struct snd_usb_audio_quirk[]) {
+ {
+ .ifnum = 1,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 2,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 3,
+ .type = QUIRK_MIDI_YAMAHA
+ },
+ {
+ .ifnum = 4,
+ .type = QUIRK_IGNORE_INTERFACE
+ },
+ {
+ .ifnum = -1
+ }
+ }
+ }
+},
+{
USB_DEVICE(0x0499, 0x150a),
.driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
/* .vendor_name = "Yamaha", */

2014-10-28 07:46:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 077/146] ALSA: hda - Fix inverted LED gpio setup for Lenovo Ideapad

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <[email protected]>

commit b1974f965a506c131b60cd3e483340884e831920 upstream.

We implemented in a wrong way for mute LED on Lenovo Ideapad; the bit
must be flipped.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16373
Fixes: 3e887f379d8a ('ALSA: hda - Add mute LED support to Lenovo Ideapad')
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/pci/hda/patch_realtek.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5786,9 +5786,9 @@ static void alc662_led_gpio1_mute_hook(v
unsigned int oldval = spec->gpio_led;

if (enabled)
- spec->gpio_led &= ~0x01;
- else
spec->gpio_led |= 0x01;
+ else
+ spec->gpio_led &= ~0x01;
if (spec->gpio_led != oldval)
snd_hda_codec_write(codec, 0x01, 0, AC_VERB_SET_GPIO_DATA,
spec->gpio_led);

2014-10-28 07:47:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 071/146] ALSA: pcm: use the same dma mmap codepath both for arm and arm64

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Anatol Pomozov <[email protected]>

commit a011e213f3700233ed2a676f1ef0a74a052d7162 upstream.

This avoids following kernel crash when try to playback on arm64

[ 107.497203] [<ffffffc00046b310>] snd_pcm_mmap_data_fault+0x90/0xd4
[ 107.503405] [<ffffffc0001541ac>] __do_fault+0xb0/0x498
[ 107.508565] [<ffffffc0001576a0>] handle_mm_fault+0x224/0x7b0
[ 107.514246] [<ffffffc000092640>] do_page_fault+0x11c/0x310
[ 107.519738] [<ffffffc000081100>] do_mem_abort+0x38/0x98

Tested: backported to 3.14 and tried to playback on arm64 machine

Signed-off-by: Anatol Pomozov <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/core/pcm_native.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3193,7 +3193,7 @@ static const struct vm_operations_struct

#ifndef ARCH_HAS_DMA_MMAP_COHERENT
/* This should be defined / handled globally! */
-#ifdef CONFIG_ARM
+#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
#define ARCH_HAS_DMA_MMAP_COHERENT
#endif
#endif

2014-10-28 03:38:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 069/146] arm64: Fix compilation error on UP builds

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Catalin Marinas <[email protected]>

commit ceab3fe69408cb98f437dad3b4b4bb79434370ef upstream.

In file included from ./arch/arm64/include/asm/irq_work.h:4:0,
from include/linux/irq_work.h:46,
from include/linux/perf_event.h:49,
from include/linux/ftrace_event.h:9,
from include/trace/syscall.h:6,
from include/linux/syscalls.h:81,
from init/main.c:18:
./arch/arm64/include/asm/smp.h:24:3:
error: #error "<asm/smp.h> included in non-SMP build"
# error "<asm/smp.h> included in non-SMP build"

Signed-off-by: Catalin Marinas <[email protected]>
Fixes: 3631073659d0 ("arm64: Tell irq work about self IPI support")
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/include/asm/irq_work.h | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/arch/arm64/include/asm/irq_work.h
+++ b/arch/arm64/include/asm/irq_work.h
@@ -1,6 +1,8 @@
#ifndef __ASM_IRQ_WORK_H
#define __ASM_IRQ_WORK_H

+#ifdef CONFIG_SMP
+
#include <asm/smp.h>

static inline bool arch_irq_work_has_interrupt(void)
@@ -8,4 +10,13 @@ static inline bool arch_irq_work_has_int
return !!__smp_cross_call;
}

+#else
+
+static inline bool arch_irq_work_has_interrupt(void)
+{
+ return false;
+}
+
+#endif
+
#endif /* __ASM_IRQ_WORK_H */

2014-10-28 07:48:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 068/146] spi: dw-mid: terminate ongoing transfers at exit

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Shevchenko <[email protected]>

commit 8e45ef682cb31fda62ed4eeede5d9745a0a1b1e2 upstream.

Do full clean up at exit, means terminate all ongoing DMA transfers.

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/spi/spi-dw-mid.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -91,7 +91,11 @@ static void mid_spi_dma_exit(struct dw_s
{
if (!dws->dma_inited)
return;
+
+ dmaengine_terminate_all(dws->txchan);
dma_release_channel(dws->txchan);
+
+ dmaengine_terminate_all(dws->rxchan);
dma_release_channel(dws->rxchan);
}


2014-10-28 07:49:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 065/146] NFSv4.1/pnfs: replace broken pnfs_put_lseg_async

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <[email protected]>

commit 6543f803670530f6aa93790d9fa116d8395a537d upstream.

You cannot call pnfs_put_lseg_async() more than once per lseg, so it
is really an inappropriate way to deal with a refcount issue.

Instead, replace it with a function that decrements the refcount, and
puts the final 'free' operation (which is incompatible with locks) on
the workqueue.

Cc: Weston Andros Adamson <[email protected]>
Fixes: e6cf82d1830f: pnfs: add pnfs_put_lseg_async
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/filelayout/filelayout.c | 2 +-
fs/nfs/pnfs.c | 33 +++++++++++++++++++++++++++------
fs/nfs/pnfs.h | 6 +-----
3 files changed, 29 insertions(+), 12 deletions(-)

--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -1031,7 +1031,7 @@ filelayout_clear_request_commit(struct n
}
out:
nfs_request_remove_commit_list(req, cinfo);
- pnfs_put_lseg_async(freeme);
+ pnfs_put_lseg_locked(freeme);
}

static void
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -361,22 +361,43 @@ pnfs_put_lseg(struct pnfs_layout_segment
}
EXPORT_SYMBOL_GPL(pnfs_put_lseg);

-static void pnfs_put_lseg_async_work(struct work_struct *work)
+static void pnfs_free_lseg_async_work(struct work_struct *work)
{
struct pnfs_layout_segment *lseg;
+ struct pnfs_layout_hdr *lo;

lseg = container_of(work, struct pnfs_layout_segment, pls_work);
+ lo = lseg->pls_layout;

- pnfs_put_lseg(lseg);
+ pnfs_free_lseg(lseg);
+ pnfs_put_layout_hdr(lo);
}

-void
-pnfs_put_lseg_async(struct pnfs_layout_segment *lseg)
+static void pnfs_free_lseg_async(struct pnfs_layout_segment *lseg)
{
- INIT_WORK(&lseg->pls_work, pnfs_put_lseg_async_work);
+ INIT_WORK(&lseg->pls_work, pnfs_free_lseg_async_work);
schedule_work(&lseg->pls_work);
}
-EXPORT_SYMBOL_GPL(pnfs_put_lseg_async);
+
+void
+pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg)
+{
+ if (!lseg)
+ return;
+
+ assert_spin_locked(&lseg->pls_layout->plh_inode->i_lock);
+
+ dprintk("%s: lseg %p ref %d valid %d\n", __func__, lseg,
+ atomic_read(&lseg->pls_refcount),
+ test_bit(NFS_LSEG_VALID, &lseg->pls_flags));
+ if (atomic_dec_and_test(&lseg->pls_refcount)) {
+ struct pnfs_layout_hdr *lo = lseg->pls_layout;
+ pnfs_get_layout_hdr(lo);
+ pnfs_layout_remove_lseg(lo, lseg);
+ pnfs_free_lseg_async(lseg);
+ }
+}
+EXPORT_SYMBOL_GPL(pnfs_put_lseg_locked);

static u64
end_offset(u64 start, u64 len)
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -183,7 +183,7 @@ extern int nfs4_proc_layoutreturn(struct
/* pnfs.c */
void pnfs_get_layout_hdr(struct pnfs_layout_hdr *lo);
void pnfs_put_lseg(struct pnfs_layout_segment *lseg);
-void pnfs_put_lseg_async(struct pnfs_layout_segment *lseg);
+void pnfs_put_lseg_locked(struct pnfs_layout_segment *lseg);

void set_pnfs_layoutdriver(struct nfs_server *, const struct nfs_fh *, u32);
void unset_pnfs_layoutdriver(struct nfs_server *);
@@ -422,10 +422,6 @@ static inline void pnfs_put_lseg(struct
{
}

-static inline void pnfs_put_lseg_async(struct pnfs_layout_segment *lseg)
-{
-}
-
static inline int pnfs_return_layout(struct inode *ino)
{
return 0;

2014-10-28 07:49:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 063/146] NFS: Fix an uninitialised pointer Oops in the writeback error path

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <[email protected]>

commit 3caa0c6ed754d91b15266abf222498edbef982bd upstream.

SteveD reports the following Oops:
RIP: 0010:[<ffffffffa053461d>] [<ffffffffa053461d>] __put_nfs_open_context+0x1d/0x100 [nfs]
RSP: 0018:ffff880fed687b90 EFLAGS: 00010286
RAX: 0000000000000024 RBX: 0000000000000000 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880fed687bc0 R08: 0000000000000092 R09: 000000000000047a
R10: 0000000000000000 R11: ffff880fed6878d6 R12: ffff880fed687d20
R13: ffff880fed687d20 R14: 0000000000000070 R15: ffffea000aa33ec0
FS: 00007fce290f0740(0000) GS:ffff8807ffc60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 00000007f2e79000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
0000000000000000 ffff880036c5e510 ffff880fed687d20 ffff880fed687d20
ffff880036c5e200 ffffea000aa33ec0 ffff880fed687bd0 ffffffffa0534710
ffff880fed687be8 ffffffffa053d5f0 ffff880036c5e200 ffff880fed687c08
Call Trace:
[<ffffffffa0534710>] put_nfs_open_context+0x10/0x20 [nfs]
[<ffffffffa053d5f0>] nfs_pgio_data_destroy+0x20/0x40 [nfs]
[<ffffffffa053d672>] nfs_pgio_error+0x22/0x40 [nfs]
[<ffffffffa053d8f4>] nfs_generic_pgio+0x74/0x2e0 [nfs]
[<ffffffffa06b18c3>] pnfs_generic_pg_writepages+0x63/0x210 [nfsv4]
[<ffffffffa053d579>] nfs_pageio_doio+0x19/0x50 [nfs]
[<ffffffffa053eb84>] nfs_pageio_complete+0x24/0x30 [nfs]
[<ffffffffa053cb25>] nfs_direct_write_schedule_iovec+0x115/0x1f0 [nfs]
[<ffffffffa053675f>] ? nfs_get_lock_context+0x4f/0x120 [nfs]
[<ffffffffa053d252>] nfs_file_direct_write+0x262/0x420 [nfs]
[<ffffffffa0532d91>] nfs_file_write+0x131/0x1d0 [nfs]
[<ffffffffa0532c60>] ? nfs_need_sync_write.isra.17+0x40/0x40 [nfs]
[<ffffffff812127b8>] do_io_submit+0x3b8/0x840
[<ffffffff81212c50>] SyS_io_submit+0x10/0x20
[<ffffffff81610f29>] system_call_fastpath+0x16/0x1b

This is due to the calls to nfs_pgio_error() in nfs_generic_pgio(), which
happen before the nfs_pgio_header's open context is referenced in
nfs_pgio_rpcsetup().

Reported-by: Steve Dickson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/pagelist.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -518,7 +518,8 @@ EXPORT_SYMBOL_GPL(nfs_pgio_header_free);
*/
void nfs_pgio_data_destroy(struct nfs_pgio_header *hdr)
{
- put_nfs_open_context(hdr->args.context);
+ if (hdr->args.context)
+ put_nfs_open_context(hdr->args.context);
if (hdr->page_array.pagevec != hdr->page_array.page_array)
kfree(hdr->page_array.pagevec);
}

2014-10-28 07:49:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 037/146] qla2xxx: Use correct offset to req-q-out for reserve calculation

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arun Easi <[email protected]>

commit 75554b68ac1e018bca00d68a430b92ada8ab52dd upstream.

Signed-off-by: Arun Easi <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/qla2xxx/qla_target.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -1431,12 +1431,10 @@ static inline void qlt_unmap_sg(struct s
static int qlt_check_reserve_free_req(struct scsi_qla_host *vha,
uint32_t req_cnt)
{
- struct qla_hw_data *ha = vha->hw;
- device_reg_t __iomem *reg = ha->iobase;
uint32_t cnt;

if (vha->req->cnt < (req_cnt + 2)) {
- cnt = (uint16_t)RD_REG_DWORD(&reg->isp24.req_q_out);
+ cnt = (uint16_t)RD_REG_DWORD(vha->req->req_q_out);

ql_dbg(ql_dbg_tgt, vha, 0xe00a,
"Request ring circled: cnt=%d, vha->->ring_index=%d, "

2014-10-28 07:50:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 045/146] Drivers: hv: vmbus: Cleanup vmbus_post_msg()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit fdeebcc62279119dbeafbc1a2e39e773839025fd upstream.

Posting messages to the host can fail because of transient resource
related failures. Correctly deal with these failures and increase the
number of attempts to post the message before giving up.

In this version of the patch, I have normalized the error code to
Linux error code.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/connection.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)

--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -427,10 +427,21 @@ int vmbus_post_msg(void *buffer, size_t
* insufficient resources. Retry the operation a couple of
* times before giving up.
*/
- while (retries < 3) {
- ret = hv_post_message(conn_id, 1, buffer, buflen);
- if (ret != HV_STATUS_INSUFFICIENT_BUFFERS)
+ while (retries < 10) {
+ ret = hv_post_message(conn_id, 1, buffer, buflen);
+
+ switch (ret) {
+ case HV_STATUS_INSUFFICIENT_BUFFERS:
+ ret = -ENOMEM;
+ case -ENOMEM:
+ break;
+ case HV_STATUS_SUCCESS:
return ret;
+ default:
+ pr_err("hv_post_msg() failed; error code:%d\n", ret);
+ return -EINVAL;
+ }
+
retries++;
msleep(100);
}

2014-10-28 07:51:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 043/146] arm64: debug: dont re-enable debug exceptions on return from el1_dbg

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Will Deacon <[email protected]>

commit 1059c6bf8534acda249e7e65c81e7696fb074dc1 upstream.

When returning from a debug exception taken from EL1, we unmask debug
exceptions after handling the exception. This is crucial for debug
exceptions taken from EL0, so that any kernel work on the ret_to_user
path can be debugged by kgdb.

However, when returning back to EL1 the only thing left to do is to
restore the original register state before the exception return. If
single-step has been enabled by the debug exception handler, we will
get stuck in an infinite debug exception loop, since we will take the
step exception as soon as we unmask debug exceptions.

This patch avoids unmasking debug exceptions on the debug exception
return path when the exception was taken from EL1.

Fixes: 2a2830703a23 (arm64: debug: avoid accessing mdscr_el1 on fault paths where possible)
Reported-by: David Long <[email protected]>
Reported-by: AKASHI Takahiro <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/kernel/entry.S | 1 -
1 file changed, 1 deletion(-)

--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -324,7 +324,6 @@ el1_dbg:
mrs x0, far_el1
mov x2, sp // struct pt_regs
bl do_debug_exception
- enable_dbg
kernel_exit 1
el1_inv:
// TODO: add support for undefined instructions in kernel mode

2014-10-28 03:38:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 040/146] dmaengine: pl330: Fix NULL pointer dereference on probe failure

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski <[email protected]>

commit 0f5ebabdd03b471da1906f7edddc61ceb35cee02 upstream.

If dma_async_device_register() returns error and probe should clean up
and return error, a NULL pointer exception happens because of
dereference of not allocated channel thread:

Dmesg log (from early printk):
dma-pl330 12680000.pdma: unable to register DMAC
DMA pl330_control: removing pch: eeac4000, chan: eeac4014, thread: (null)
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
pgd = c0004000
[0000000c] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc3-next-20140904-00005-g6cc4c1937d90-dirty #427
task: ee80a800 ti: ee888000 task.ti: ee888000
PC is at _stop+0x8/0x2c8
LR is at pl330_control+0x70/0x2e8
pc : [<c0205dc8>] lr : [<c020623c>] psr: 60000193
sp : ee889df8 ip : 00000002 fp : 00000000
r10: eeac4014 r9 : ee0e62bc r8 : 00000000
r7 : eeac405c r6 : 60000113 r5 : ee0e6210 r4 : eeac4000
r3 : 00000002 r2 : 00000002 r1 : 00010000 r0 : 00000000
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c5387d Table: 4000404a DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xee888240)
Stack: (0xee889df8 to 0xee88a000)
9de0: 00000002 eeac4000
9e00: ee0e6210 eeac4000 ee0e6210 60000113 eeac405c c020623c 00000000 c020725c
9e20: ee889e20 ee889e20 ee0e6210 eeac4080 00200200 00100100 eeac4014 00000020
9e40: ee0e6218 c0208374 00000000 ee9bb340 ee0e6210 00000000 00000000 c0605cd8
9e60: ee970000 c0605c84 ee9700f8 00000000 c05c4270 00000000 00000000 c0203b3c
9e80: ee970000 c06624a8 00000000 c0605c84 00000000 c023f890 ee970000 c0605c84
9ea0: ee970034 00000000 c05b23d0 c023fa3c 00000000 c0605c84 c023f9b0 c023e0d4
9ec0: ee947e78 ee9b9440 c0605c84 eea1e780 c0605acc c023f094 c0513b50 c0605c84
9ee0: c05ecbd8 c0605c84 c05ecbd8 ee11ba40 c0626500 c0240064 00000000 c05ecbd8
9f00: c05ecbd8 c0008964 c040f13c 0000009f c0626500 c057465c ee80a800 60000113
9f20: 00000000 c05efdb0 60000113 00000000 ef7fc89d c0421168 0000008f c003787c
9f40: c0573d6c 00000006 ef7fc8bb 00000006 c05efd50 ef7fc800 c05dfbc4 00000006
9f60: c05c4264 c0626500 0000008f c05c4270 c059b518 c059bcb4 00000006 00000006
9f80: c059b518 c003c08c 00000000 c040091c 00000000 00000000 00000000 00000000
9fa0: 00000000 c0400924 00000000 c000e7b8 00000000 00000000 00000000 00000000
9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 c0c0c0c0 c0c0c0c0
[<c0205dc8>] (_stop) from [<c020623c>] (pl330_control+0x70/0x2e8)
[<c020623c>] (pl330_control) from [<c0208374>] (pl330_probe+0x594/0x75c)
[<c0208374>] (pl330_probe) from [<c0203b3c>] (amba_probe+0xb8/0x120)
[<c0203b3c>] (amba_probe) from [<c023f890>] (driver_probe_device+0x10c/0x22c)
[<c023f890>] (driver_probe_device) from [<c023fa3c>] (__driver_attach+0x8c/0x90)
[<c023fa3c>] (__driver_attach) from [<c023e0d4>] (bus_for_each_dev+0x54/0x88)
[<c023e0d4>] (bus_for_each_dev) from [<c023f094>] (bus_add_driver+0xd4/0x1d0)
[<c023f094>] (bus_add_driver) from [<c0240064>] (driver_register+0x78/0xf4)
[<c0240064>] (driver_register) from [<c0008964>] (do_one_initcall+0x80/0x1d0)
[<c0008964>] (do_one_initcall) from [<c059bcb4>] (kernel_init_freeable+0x108/0x1d4)
[<c059bcb4>] (kernel_init_freeable) from [<c0400924>] (kernel_init+0x8/0xec)
[<c0400924>] (kernel_init) from [<c000e7b8>] (ret_from_fork+0x14/0x3c)
Code: e5813010 e12fff1e e92d40f0 e24dd00c (e590200c)
---[ end trace c94b2f4f38dff3bf ]---

This happens because the necessary resources were not yet allocated - no
call to pl330_alloc_chan_resources().

Terminate the thread and free channel resource only if channel thread is not NULL.

Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: 0b94c5771705 ("DMA: PL330: Add check if device tree compatible")
Reviewed-by: Lars-Peter Clausen <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/dma/pl330.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2755,8 +2755,10 @@ probe_err3:
list_del(&pch->chan.device_node);

/* Flush the channel */
- pl330_control(&pch->chan, DMA_TERMINATE_ALL, 0);
- pl330_free_chan_resources(&pch->chan);
+ if (pch->thread) {
+ pl330_control(&pch->chan, DMA_TERMINATE_ALL, 0);
+ pl330_free_chan_resources(&pch->chan);
+ }
}
probe_err2:
pl330_del(pl330);

2014-10-28 07:51:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 042/146] firmware_class: make sure fw requests contain a name

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kees Cook <[email protected]>

commit 471b095dfe0d693a8d624cbc716d1ee4d74eb437 upstream.

An empty firmware request name will trigger warnings when building
device names. Make sure this is caught earlier and rejected.

The warning was visible via the test_firmware.ko module interface:

echo -ne "\x00" > /sys/devices/virtual/misc/test_firmware/trigger_request

Reported-by: Sasha Levin <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Tested-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/firmware_class.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -1105,6 +1105,9 @@ _request_firmware(const struct firmware
if (!firmware_p)
return -EINVAL;

+ if (!name || name[0] == '\0')
+ return -EINVAL;
+
ret = _request_firmware_prepare(&fw, name, device);
if (ret <= 0) /* error or already assigned */
goto out;

2014-10-28 07:52:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 060/146] NFSv4: fix open/lock state recovery error handling

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Trond Myklebust <[email protected]>

commit df817ba35736db2d62b07de6f050a4db53492ad8 upstream.

The current open/lock state recovery unfortunately does not handle errors
such as NFS4ERR_CONN_NOT_BOUND_TO_SESSION correctly. Instead of looping,
just proceeds as if the state manager is finished recovering.
This patch ensures that we loop back, handle higher priority errors
and complete the open/lock state recovery.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/nfs4state.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1705,7 +1705,8 @@ restart:
if (status < 0) {
set_bit(ops->owner_flag_bit, &sp->so_flags);
nfs4_put_state_owner(sp);
- return nfs4_recovery_handle_error(clp, status);
+ status = nfs4_recovery_handle_error(clp, status);
+ return (status != 0) ? status : -EAGAIN;
}

nfs4_put_state_owner(sp);
@@ -1714,7 +1715,7 @@ restart:
spin_unlock(&clp->cl_lock);
}
rcu_read_unlock();
- return status;
+ return 0;
}

static int nfs4_check_lease(struct nfs_client *clp)
@@ -2366,14 +2367,11 @@ static void nfs4_state_manager(struct nf
section = "reclaim reboot";
status = nfs4_do_reclaim(clp,
clp->cl_mvops->reboot_recovery_ops);
- if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state) ||
- test_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state))
- continue;
- nfs4_state_end_reclaim_reboot(clp);
- if (test_bit(NFS4CLNT_RECLAIM_NOGRACE, &clp->cl_state))
+ if (status == -EAGAIN)
continue;
if (status < 0)
goto out_error;
+ nfs4_state_end_reclaim_reboot(clp);
}

/* Now recover expired state... */
@@ -2381,9 +2379,7 @@ static void nfs4_state_manager(struct nf
section = "reclaim nograce";
status = nfs4_do_reclaim(clp,
clp->cl_mvops->nograce_recovery_ops);
- if (test_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state) ||
- test_bit(NFS4CLNT_SESSION_RESET, &clp->cl_state) ||
- test_bit(NFS4CLNT_RECLAIM_REBOOT, &clp->cl_state))
+ if (status == -EAGAIN)
continue;
if (status < 0)
goto out_error;

2014-10-28 07:52:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 058/146] nfs: fix duplicate proc entries

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fabian Frederick <[email protected]>

commit 2f3169fb18f4643ac9a6a097a6a6c71f0b2cef75 upstream.

Commit 65b38851a174
("NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes")

updated the following function:
static int nfs_volume_list_open(struct inode *inode, struct file *file)

it used &nfs_server_list_ops instead of &nfs_volume_list_ops
which means cat /proc/fs/nfsfs/volumes = /proc/fs/nfsfs/servers

Signed-off-by: Fabian Frederick <[email protected]>
Fixes: 65b38851a174 (NFS: Fix /proc/fs/nfsfs/servers and...)
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/client.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1318,7 +1318,7 @@ static int nfs_server_list_show(struct s
*/
static int nfs_volume_list_open(struct inode *inode, struct file *file)
{
- return seq_open_net(inode, file, &nfs_server_list_ops,
+ return seq_open_net(inode, file, &nfs_volume_list_ops,
sizeof(struct seq_net_private));
}


2014-10-28 03:38:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 055/146] Revert "lzo: properly check for overruns"

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <[email protected]>

commit af958a38a60c7ca3d8a39c918c1baa2ff7b6b233 upstream.

This reverts commit 206a81c ("lzo: properly check for overruns").

As analysed by Willem Pinckaers, this fix is still incomplete on
certain rare corner cases, and it is easier to restart from the
original code.

Reported-by: Willem Pinckaers <[email protected]>
Cc: "Don A. Bailey" <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
lib/lzo/lzo1x_decompress_safe.c | 62 +++++++++++++---------------------------
1 file changed, 21 insertions(+), 41 deletions(-)

--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -19,31 +19,11 @@
#include <linux/lzo.h>
#include "lzodefs.h"

-#define HAVE_IP(t, x) \
- (((size_t)(ip_end - ip) >= (size_t)(t + x)) && \
- (((t + x) >= t) && ((t + x) >= x)))
-
-#define HAVE_OP(t, x) \
- (((size_t)(op_end - op) >= (size_t)(t + x)) && \
- (((t + x) >= t) && ((t + x) >= x)))
-
-#define NEED_IP(t, x) \
- do { \
- if (!HAVE_IP(t, x)) \
- goto input_overrun; \
- } while (0)
-
-#define NEED_OP(t, x) \
- do { \
- if (!HAVE_OP(t, x)) \
- goto output_overrun; \
- } while (0)
-
-#define TEST_LB(m_pos) \
- do { \
- if ((m_pos) < out) \
- goto lookbehind_overrun; \
- } while (0)
+#define HAVE_IP(x) ((size_t)(ip_end - ip) >= (size_t)(x))
+#define HAVE_OP(x) ((size_t)(op_end - op) >= (size_t)(x))
+#define NEED_IP(x) if (!HAVE_IP(x)) goto input_overrun
+#define NEED_OP(x) if (!HAVE_OP(x)) goto output_overrun
+#define TEST_LB(m_pos) if ((m_pos) < out) goto lookbehind_overrun

int lzo1x_decompress_safe(const unsigned char *in, size_t in_len,
unsigned char *out, size_t *out_len)
@@ -78,14 +58,14 @@ int lzo1x_decompress_safe(const unsigned
while (unlikely(*ip == 0)) {
t += 255;
ip++;
- NEED_IP(1, 0);
+ NEED_IP(1);
}
t += 15 + *ip++;
}
t += 3;
copy_literal_run:
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
- if (likely(HAVE_IP(t, 15) && HAVE_OP(t, 15))) {
+ if (likely(HAVE_IP(t + 15) && HAVE_OP(t + 15))) {
const unsigned char *ie = ip + t;
unsigned char *oe = op + t;
do {
@@ -101,8 +81,8 @@ copy_literal_run:
} else
#endif
{
- NEED_OP(t, 0);
- NEED_IP(t, 3);
+ NEED_OP(t);
+ NEED_IP(t + 3);
do {
*op++ = *ip++;
} while (--t > 0);
@@ -115,7 +95,7 @@ copy_literal_run:
m_pos -= t >> 2;
m_pos -= *ip++ << 2;
TEST_LB(m_pos);
- NEED_OP(2, 0);
+ NEED_OP(2);
op[0] = m_pos[0];
op[1] = m_pos[1];
op += 2;
@@ -139,10 +119,10 @@ copy_literal_run:
while (unlikely(*ip == 0)) {
t += 255;
ip++;
- NEED_IP(1, 0);
+ NEED_IP(1);
}
t += 31 + *ip++;
- NEED_IP(2, 0);
+ NEED_IP(2);
}
m_pos = op - 1;
next = get_unaligned_le16(ip);
@@ -157,10 +137,10 @@ copy_literal_run:
while (unlikely(*ip == 0)) {
t += 255;
ip++;
- NEED_IP(1, 0);
+ NEED_IP(1);
}
t += 7 + *ip++;
- NEED_IP(2, 0);
+ NEED_IP(2);
}
next = get_unaligned_le16(ip);
ip += 2;
@@ -174,7 +154,7 @@ copy_literal_run:
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
if (op - m_pos >= 8) {
unsigned char *oe = op + t;
- if (likely(HAVE_OP(t, 15))) {
+ if (likely(HAVE_OP(t + 15))) {
do {
COPY8(op, m_pos);
op += 8;
@@ -184,7 +164,7 @@ copy_literal_run:
m_pos += 8;
} while (op < oe);
op = oe;
- if (HAVE_IP(6, 0)) {
+ if (HAVE_IP(6)) {
state = next;
COPY4(op, ip);
op += next;
@@ -192,7 +172,7 @@ copy_literal_run:
continue;
}
} else {
- NEED_OP(t, 0);
+ NEED_OP(t);
do {
*op++ = *m_pos++;
} while (op < oe);
@@ -201,7 +181,7 @@ copy_literal_run:
#endif
{
unsigned char *oe = op + t;
- NEED_OP(t, 0);
+ NEED_OP(t);
op[0] = m_pos[0];
op[1] = m_pos[1];
op += 2;
@@ -214,15 +194,15 @@ match_next:
state = next;
t = next;
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
- if (likely(HAVE_IP(6, 0) && HAVE_OP(4, 0))) {
+ if (likely(HAVE_IP(6) && HAVE_OP(4))) {
COPY4(op, ip);
op += t;
ip += t;
} else
#endif
{
- NEED_IP(t, 3);
- NEED_OP(t, 0);
+ NEED_IP(t + 3);
+ NEED_OP(t);
while (t > 0) {
*op++ = *ip++;
t--;

2014-10-28 07:53:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 057/146] tty: omap-serial: fix division by zero

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Frans Klaver <[email protected]>

commit dc3187564e61260f49eceb21a4e7eb5e4428e90a upstream.

If the chosen baud rate is large enough (e.g. 3.5 megabaud), the
calculated n values in serial_omap_is_baud_mode16() may become 0. This
causes a division by zero when calculating the difference between
calculated and desired baud rates. To prevent this, cap the n13 and n16
values on 1.

Division by zero in kernel.
[<c00132e0>] (unwind_backtrace) from [<c00112ec>] (show_stack+0x10/0x14)
[<c00112ec>] (show_stack) from [<c01ed7bc>] (Ldiv0+0x8/0x10)
[<c01ed7bc>] (Ldiv0) from [<c023805c>] (serial_omap_baud_is_mode16+0x4c/0x68)
[<c023805c>] (serial_omap_baud_is_mode16) from [<c02396b4>] (serial_omap_set_termios+0x90/0x8d8)
[<c02396b4>] (serial_omap_set_termios) from [<c0230a0c>] (uart_change_speed+0xa4/0xa8)
[<c0230a0c>] (uart_change_speed) from [<c0231798>] (uart_set_termios+0xa0/0x1fc)
[<c0231798>] (uart_set_termios) from [<c022bb44>] (tty_set_termios+0x248/0x2c0)
[<c022bb44>] (tty_set_termios) from [<c022c17c>] (set_termios+0x248/0x29c)
[<c022c17c>] (set_termios) from [<c022c3e4>] (tty_mode_ioctl+0x1c8/0x4e8)
[<c022c3e4>] (tty_mode_ioctl) from [<c0227e70>] (tty_ioctl+0xa94/0xb18)
[<c0227e70>] (tty_ioctl) from [<c00cf45c>] (do_vfs_ioctl+0x4a0/0x560)
[<c00cf45c>] (do_vfs_ioctl) from [<c00cf568>] (SyS_ioctl+0x4c/0x74)
[<c00cf568>] (SyS_ioctl) from [<c000e480>] (ret_fast_syscall+0x0/0x30)

Signed-off-by: Frans Klaver <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/serial/omap-serial.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/tty/serial/omap-serial.c
+++ b/drivers/tty/serial/omap-serial.c
@@ -254,8 +254,16 @@ serial_omap_baud_is_mode16(struct uart_p
{
unsigned int n13 = port->uartclk / (13 * baud);
unsigned int n16 = port->uartclk / (16 * baud);
- int baudAbsDiff13 = baud - (port->uartclk / (13 * n13));
- int baudAbsDiff16 = baud - (port->uartclk / (16 * n16));
+ int baudAbsDiff13;
+ int baudAbsDiff16;
+
+ if (n13 == 0)
+ n13 = 1;
+ if (n16 == 0)
+ n16 = 1;
+
+ baudAbsDiff13 = baud - (port->uartclk / (13 * n13));
+ baudAbsDiff16 = baud - (port->uartclk / (16 * n16));
if (baudAbsDiff13 < 0)
baudAbsDiff13 = -baudAbsDiff13;
if (baudAbsDiff16 < 0)

2014-10-28 03:38:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 056/146] lzo: check for length overrun in variable length encoding.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willy Tarreau <[email protected]>

commit 72cf90124e87d975d0b2114d930808c58b4c05e4 upstream.

This fix ensures that we never meet an integer overflow while adding
255 while parsing a variable length encoding. It works differently from
commit 206a81c ("lzo: properly check for overruns") because instead of
ensuring that we don't overrun the input, which is tricky to guarantee
due to many assumptions in the code, it simply checks that the cumulated
number of 255 read cannot overflow by bounding this number.

The MAX_255_COUNT is the maximum number of times we can add 255 to a base
count without overflowing an integer. The multiply will overflow when
multiplying 255 by more than MAXINT/255. The sum will overflow earlier
depending on the base count. Since the base count is taken from a u8
and a few bits, it is safe to assume that it will always be lower than
or equal to 2*255, thus we can always prevent any overflow by accepting
two less 255 steps.

This patch also reduces the CPU overhead and actually increases performance
by 1.1% compared to the initial code, while the previous fix costs 3.1%
(measured on x86_64).

The fix needs to be backported to all currently supported stable kernels.

Reported-by: Willem Pinckaers <[email protected]>
Cc: "Don A. Bailey" <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
lib/lzo/lzo1x_decompress_safe.c | 43 ++++++++++++++++++++++++++++++++++------
1 file changed, 37 insertions(+), 6 deletions(-)

--- a/lib/lzo/lzo1x_decompress_safe.c
+++ b/lib/lzo/lzo1x_decompress_safe.c
@@ -25,6 +25,16 @@
#define NEED_OP(x) if (!HAVE_OP(x)) goto output_overrun
#define TEST_LB(m_pos) if ((m_pos) < out) goto lookbehind_overrun

+/* This MAX_255_COUNT is the maximum number of times we can add 255 to a base
+ * count without overflowing an integer. The multiply will overflow when
+ * multiplying 255 by more than MAXINT/255. The sum will overflow earlier
+ * depending on the base count. Since the base count is taken from a u8
+ * and a few bits, it is safe to assume that it will always be lower than
+ * or equal to 2*255, thus we can always prevent any overflow by accepting
+ * two less 255 steps. See Documentation/lzo.txt for more information.
+ */
+#define MAX_255_COUNT ((((size_t)~0) / 255) - 2)
+
int lzo1x_decompress_safe(const unsigned char *in, size_t in_len,
unsigned char *out, size_t *out_len)
{
@@ -55,12 +65,19 @@ int lzo1x_decompress_safe(const unsigned
if (t < 16) {
if (likely(state == 0)) {
if (unlikely(t == 0)) {
+ size_t offset;
+ const unsigned char *ip_last = ip;
+
while (unlikely(*ip == 0)) {
- t += 255;
ip++;
NEED_IP(1);
}
- t += 15 + *ip++;
+ offset = ip - ip_last;
+ if (unlikely(offset > MAX_255_COUNT))
+ return LZO_E_ERROR;
+
+ offset = (offset << 8) - offset;
+ t += offset + 15 + *ip++;
}
t += 3;
copy_literal_run:
@@ -116,12 +133,19 @@ copy_literal_run:
} else if (t >= 32) {
t = (t & 31) + (3 - 1);
if (unlikely(t == 2)) {
+ size_t offset;
+ const unsigned char *ip_last = ip;
+
while (unlikely(*ip == 0)) {
- t += 255;
ip++;
NEED_IP(1);
}
- t += 31 + *ip++;
+ offset = ip - ip_last;
+ if (unlikely(offset > MAX_255_COUNT))
+ return LZO_E_ERROR;
+
+ offset = (offset << 8) - offset;
+ t += offset + 31 + *ip++;
NEED_IP(2);
}
m_pos = op - 1;
@@ -134,12 +158,19 @@ copy_literal_run:
m_pos -= (t & 8) << 11;
t = (t & 7) + (3 - 1);
if (unlikely(t == 2)) {
+ size_t offset;
+ const unsigned char *ip_last = ip;
+
while (unlikely(*ip == 0)) {
- t += 255;
ip++;
NEED_IP(1);
}
- t += 7 + *ip++;
+ offset = ip - ip_last;
+ if (unlikely(offset > MAX_255_COUNT))
+ return LZO_E_ERROR;
+
+ offset = (offset << 8) - offset;
+ t += offset + 7 + *ip++;
NEED_IP(2);
}
next = get_unaligned_le16(ip);

2014-10-28 07:54:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 039/146] dmaengine: fix xor sources continuation

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xuelin Shi <[email protected]>

commit 87cea76384257e6ac3fa4791b6a6b9d0335f7457 upstream.

the partial xor result must be kept until the next
tx is generated.

Signed-off-by: Xuelin Shi <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
crypto/async_tx/async_xor.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -78,8 +78,6 @@ do_async_xor(struct dma_chan *chan, stru
tx = dma->device_prep_dma_xor(chan, dma_dest, src_list,
xor_src_cnt, unmap->len,
dma_flags);
- src_list[0] = tmp;
-

if (unlikely(!tx))
async_tx_quiesce(&submit->depend_tx);
@@ -92,6 +90,7 @@ do_async_xor(struct dma_chan *chan, stru
xor_src_cnt, unmap->len,
dma_flags);
}
+ src_list[0] = tmp;

dma_set_unmap(tx, unmap);
async_tx_submit(chan, tx, submit);

2014-10-28 03:37:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 053/146] Fixing lease renewal

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Olga Kornievskaia <[email protected]>

commit 8faaa6d5d48b201527e0451296d9e71d23afb362 upstream.

Commit c9fdeb28 removed a 'continue' after checking if the lease needs
to be renewed. However, if client hasn't moved, the code falls down to
starting reboot recovery erroneously (ie., sends open reclaim and gets
back stale_clientid error) before recovering from getting stale_clientid
on the renew operation.

Signed-off-by: Olga Kornievskaia <[email protected]>
Fixes: c9fdeb280b8c (NFS: Add basic migration support to state manager thread)
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfs/nfs4state.c | 1 +
1 file changed, 1 insertion(+)

--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -2345,6 +2345,7 @@ static void nfs4_state_manager(struct nf
status = nfs4_check_lease(clp);
if (status < 0)
goto out_error;
+ continue;
}

if (test_and_clear_bit(NFS4CLNT_MOVED, &clp->cl_state)) {

2014-10-28 07:55:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 052/146] m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Geert Uytterhoeven <[email protected]>

commit e4dc601bf99ccd1c95b7e6eef1d3cf3c4b0d4961 upstream.

hwreg_present() and hwreg_write() temporarily change the VBR register to
another vector table. This table contains a valid bus error handler
only, all other entries point to arbitrary addresses.

If an interrupt comes in while the temporary table is active, the
processor will start executing at such an arbitrary address, and the
kernel will crash.

While most callers run early, before interrupts are enabled, or
explicitly disable interrupts, Finn Thain pointed out that macsonic has
one callsite that doesn't, causing intermittent boot crashes.
There's another unsafe callsite in hilkbd.

Fix this for good by disabling and restoring interrupts inside
hwreg_present() and hwreg_write().

Explicitly disabling interrupts can be removed from the callsites later.

Reported-by: Finn Thain <[email protected]>
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/m68k/mm/hwtest.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/arch/m68k/mm/hwtest.c
+++ b/arch/m68k/mm/hwtest.c
@@ -28,9 +28,11 @@
int hwreg_present( volatile void *regp )
{
int ret = 0;
+ unsigned long flags;
long save_sp, save_vbr;
long tmp_vectors[3];

+ local_irq_save(flags);
__asm__ __volatile__
( "movec %/vbr,%2\n\t"
"movel #Lberr1,%4@(8)\n\t"
@@ -46,6 +48,7 @@ int hwreg_present( volatile void *regp )
: "=&d" (ret), "=&r" (save_sp), "=&r" (save_vbr)
: "a" (regp), "a" (tmp_vectors)
);
+ local_irq_restore(flags);

return( ret );
}
@@ -58,9 +61,11 @@ EXPORT_SYMBOL(hwreg_present);
int hwreg_write( volatile void *regp, unsigned short val )
{
int ret;
+ unsigned long flags;
long save_sp, save_vbr;
long tmp_vectors[3];

+ local_irq_save(flags);
__asm__ __volatile__
( "movec %/vbr,%2\n\t"
"movel #Lberr2,%4@(8)\n\t"
@@ -78,6 +83,7 @@ int hwreg_write( volatile void *regp, un
: "=&d" (ret), "=&r" (save_sp), "=&r" (save_vbr)
: "a" (regp), "a" (tmp_vectors), "g" (val)
);
+ local_irq_restore(flags);

return( ret );
}

2014-10-28 07:55:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 050/146] Drivers: hv: vmbus: Cleanup hv_post_message()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit b29ef3546aecb253a5552b198cef23750d56e1e4 upstream.

Minimize failures in this function by pre-allocating the buffer
for posting messages. The hypercall for posting the message can fail
for a number of reasons:

1. Transient resource related issues
2. Buffer alignment
3. Buffer cannot span a page boundry

We address issues 2 and 3 by preallocating a per-cpu page for the buffer.
Transient resource related failures are handled by retrying by the callers
of this function.

This patch is based on the investigation
done by Dexuan Cui <[email protected]>.

I would like to thank Sitsofe Wheeler <[email protected]>
for reporting the issue and helping in debuggging.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Reported-by: Sitsofe Wheeler <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/hv.c | 27 +++++++++++++++------------
drivers/hv/hyperv_vmbus.h | 4 ++++
2 files changed, 19 insertions(+), 12 deletions(-)

--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -138,6 +138,8 @@ int hv_init(void)
memset(hv_context.synic_event_page, 0, sizeof(void *) * NR_CPUS);
memset(hv_context.synic_message_page, 0,
sizeof(void *) * NR_CPUS);
+ memset(hv_context.post_msg_page, 0,
+ sizeof(void *) * NR_CPUS);
memset(hv_context.vp_index, 0,
sizeof(int) * NR_CPUS);
memset(hv_context.event_dpc, 0,
@@ -217,26 +219,18 @@ int hv_post_message(union hv_connection_
enum hv_message_type message_type,
void *payload, size_t payload_size)
{
- struct aligned_input {
- u64 alignment8;
- struct hv_input_post_message msg;
- };

struct hv_input_post_message *aligned_msg;
u16 status;
- unsigned long addr;

if (payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT)
return -EMSGSIZE;

- addr = (unsigned long)kmalloc(sizeof(struct aligned_input), GFP_ATOMIC);
- if (!addr)
- return -ENOMEM;
-
aligned_msg = (struct hv_input_post_message *)
- (ALIGN(addr, HV_HYPERCALL_PARAM_ALIGN));
+ hv_context.post_msg_page[get_cpu()];

aligned_msg->connectionid = connection_id;
+ aligned_msg->reserved = 0;
aligned_msg->message_type = message_type;
aligned_msg->payload_size = payload_size;
memcpy((void *)aligned_msg->payload, payload, payload_size);
@@ -244,8 +238,7 @@ int hv_post_message(union hv_connection_
status = do_hypercall(HVCALL_POST_MESSAGE, aligned_msg, NULL)
& 0xFFFF;

- kfree((void *)addr);
-
+ put_cpu();
return status;
}

@@ -294,6 +287,14 @@ int hv_synic_alloc(void)
pr_err("Unable to allocate SYNIC event page\n");
goto err;
}
+
+ hv_context.post_msg_page[cpu] =
+ (void *)get_zeroed_page(GFP_ATOMIC);
+
+ if (hv_context.post_msg_page[cpu] == NULL) {
+ pr_err("Unable to allocate post msg page\n");
+ goto err;
+ }
}

return 0;
@@ -308,6 +309,8 @@ static void hv_synic_free_cpu(int cpu)
free_page((unsigned long)hv_context.synic_event_page[cpu]);
if (hv_context.synic_message_page[cpu])
free_page((unsigned long)hv_context.synic_message_page[cpu]);
+ if (hv_context.post_msg_page[cpu])
+ free_page((unsigned long)hv_context.post_msg_page[cpu]);
}

void hv_synic_free(void)
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -515,6 +515,10 @@ struct hv_context {
* per-cpu list of the channels based on their CPU affinity.
*/
struct list_head percpu_list[NR_CPUS];
+ /*
+ * buffer to post messages to the host.
+ */
+ void *post_msg_page[NR_CPUS];
};

extern struct hv_context hv_context;

2014-10-28 03:37:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 038/146] qla2xxx: Fix shost use-after-free on device removal

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Joe Lawrence <[email protected]>

commit db7157d4cfce6edf052452fb1d327d4d11b67f4c upstream.

Once calling scsi_host_put, be careful to not access qla_hw_data through
the Scsi_Host private data (ie, scsi_qla_host base_vha).

Fixes: fe1b806f4f71 ("qla2xxx: Refactor shutdown code so some functionality can be reused")
Signed-off-by: Joe Lawrence <[email protected]>
Acked-by: Chad Dupuis <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/qla2xxx/qla_os.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -3119,10 +3119,8 @@ qla2x00_unmap_iobases(struct qla_hw_data
}

static void
-qla2x00_clear_drv_active(scsi_qla_host_t *vha)
+qla2x00_clear_drv_active(struct qla_hw_data *ha)
{
- struct qla_hw_data *ha = vha->hw;
-
if (IS_QLA8044(ha)) {
qla8044_idc_lock(ha);
qla8044_clear_drv_active(ha);
@@ -3193,7 +3191,7 @@ qla2x00_remove_one(struct pci_dev *pdev)

scsi_host_put(base_vha->host);

- qla2x00_clear_drv_active(base_vha);
+ qla2x00_clear_drv_active(ha);

qla2x00_unmap_iobases(ha);


2014-10-28 07:56:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 049/146] Drivers: hv: vmbus: Fix a bug in vmbus_open()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit 45d727cee9e200f5b351528b9fb063b69cf702c8 upstream.

Fix a bug in vmbus_open() and properly propagate the error. I would
like to thank Dexuan Cui <[email protected]> for identifying the
issue.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/channel.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -165,8 +165,10 @@ int vmbus_open(struct vmbus_channel *new
ret = vmbus_post_msg(open_msg,
sizeof(struct vmbus_channel_open_channel));

- if (ret != 0)
+ if (ret != 0) {
+ err = ret;
goto error1;
+ }

t = wait_for_completion_timeout(&open_info->waitevent, 5*HZ);
if (t == 0) {

2014-10-28 03:37:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 047/146] Drivers: hv: vmbus: Cleanup vmbus_close_internal()

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: "K. Y. Srinivasan" <[email protected]>

commit 98d731bb064a9d1817a6ca9bf8b97051334a7cfe upstream.

Eliminate calls to BUG_ON() in vmbus_close_internal().
We have chosen to potentially leak memory, than crash the guest
in case of failures.

In this version of the patch I have addressed comments from
Dan Carpenter ([email protected]).

Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Sitsofe Wheeler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/channel.c | 29 +++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -479,7 +479,7 @@ static void reset_channel_cb(void *arg)
channel->onchannel_callback = NULL;
}

-static void vmbus_close_internal(struct vmbus_channel *channel)
+static int vmbus_close_internal(struct vmbus_channel *channel)
{
struct vmbus_channel_close_channel *msg;
int ret;
@@ -502,11 +502,28 @@ static void vmbus_close_internal(struct

ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_close_channel));

- BUG_ON(ret != 0);
+ if (ret) {
+ pr_err("Close failed: close post msg return is %d\n", ret);
+ /*
+ * If we failed to post the close msg,
+ * it is perhaps better to leak memory.
+ */
+ return ret;
+ }
+
/* Tear down the gpadl for the channel's ring buffer */
- if (channel->ringbuffer_gpadlhandle)
- vmbus_teardown_gpadl(channel,
- channel->ringbuffer_gpadlhandle);
+ if (channel->ringbuffer_gpadlhandle) {
+ ret = vmbus_teardown_gpadl(channel,
+ channel->ringbuffer_gpadlhandle);
+ if (ret) {
+ pr_err("Close failed: teardown gpadl return %d\n", ret);
+ /*
+ * If we failed to teardown gpadl,
+ * it is perhaps better to leak memory.
+ */
+ return ret;
+ }
+ }

/* Cleanup the ring buffers for this channel */
hv_ringbuffer_cleanup(&channel->outbound);
@@ -515,7 +532,7 @@ static void vmbus_close_internal(struct
free_pages((unsigned long)channel->ringbuffer_pages,
get_order(channel->ringbuffer_pagecount * PAGE_SIZE));

-
+ return ret;
}

/*

2014-10-28 03:37:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 022/146] x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bryan O'Donoghue <[email protected]>

commit ee1b5b165c0a2f04d2107e634e51f05d0eb107de upstream.

Quark x1000 advertises PGE via the standard CPUID method
PGE bits exist in Quark X1000's PTEs. In order to flush
an individual PTE it is necessary to reload CR3 irrespective
of the PTE.PGE bit.

See Quark Core_DevMan_001.pdf section 6.4.11

This bug was fixed in Galileo kernels, unfixed vanilla kernels are expected to
crash and burn on this platform.

Signed-off-by: Bryan O'Donoghue <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/intel.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -144,6 +144,21 @@ static void early_init_intel(struct cpui
setup_clear_cpu_cap(X86_FEATURE_ERMS);
}
}
+
+ /*
+ * Intel Quark Core DevMan_001.pdf section 6.4.11
+ * "The operating system also is required to invalidate (i.e., flush)
+ * the TLB when any changes are made to any of the page table entries.
+ * The operating system must reload CR3 to cause the TLB to be flushed"
+ *
+ * As a result cpu_has_pge() in arch/x86/include/asm/tlbflush.h should
+ * be false so that __flush_tlb_all() causes CR3 insted of CR4.PGE
+ * to be modified
+ */
+ if (c->x86 == 5 && c->x86_model == 9) {
+ pr_info("Disabling PGE capability bit\n");
+ setup_clear_cpu_cap(X86_FEATURE_PGE);
+ }
}

#ifdef CONFIG_X86_32

2014-10-28 07:57:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 014/146] fs: Add a missing permission check to do_umount

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit a1480dcc3c706e309a88884723446f2e84fedd5b upstream.

Accessing do_remount_sb should require global CAP_SYS_ADMIN, but
only one of the two call sites was appropriately protected.

Fixes CVE-2014-7975.

Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namespace.c | 2 ++
1 file changed, 2 insertions(+)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1356,6 +1356,8 @@ static int do_umount(struct mount *mnt,
* Special case for "unmounting" root ...
* we just try to remount it readonly.
*/
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
down_write(&sb->s_umount);
if (!(sb->s_flags & MS_RDONLY))
retval = do_remount_sb(sb, MS_RDONLY, NULL, 0);

2014-10-28 07:57:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 021/146] x86,kvm,vmx: Preserve CR4 across VM entry

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit d974baa398f34393db76be45f7d4d04fbdbb4a0a upstream.

CR4 isn't constant; at least the TSD and PCE bits can vary.

TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.

This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.

Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Cc: Petr Matousek <[email protected]>
Cc: Gleb Natapov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/vmx.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -453,6 +453,7 @@ struct vcpu_vmx {
int gs_ldt_reload_needed;
int fs_reload_needed;
u64 msr_host_bndcfgs;
+ unsigned long vmcs_host_cr4; /* May not match real cr4 */
} host_state;
struct {
int vm86_active;
@@ -4235,11 +4236,16 @@ static void vmx_set_constant_host_state(
u32 low32, high32;
unsigned long tmpl;
struct desc_ptr dt;
+ unsigned long cr4;

vmcs_writel(HOST_CR0, read_cr0() & ~X86_CR0_TS); /* 22.2.3 */
- vmcs_writel(HOST_CR4, read_cr4()); /* 22.2.3, 22.2.5 */
vmcs_writel(HOST_CR3, read_cr3()); /* 22.2.3 FIXME: shadow tables */

+ /* Save the most likely value for this task's CR4 in the VMCS. */
+ cr4 = read_cr4();
+ vmcs_writel(HOST_CR4, cr4); /* 22.2.3, 22.2.5 */
+ vmx->host_state.vmcs_host_cr4 = cr4;
+
vmcs_write16(HOST_CS_SELECTOR, __KERNEL_CS); /* 22.2.4 */
#ifdef CONFIG_X86_64
/*
@@ -7376,7 +7382,7 @@ static void atomic_switch_perf_msrs(stru
static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long debugctlmsr;
+ unsigned long debugctlmsr, cr4;

/* Record the guest's net vcpu time for enforced NMI injections. */
if (unlikely(!cpu_has_virtual_nmis() && vmx->soft_vnmi_blocked))
@@ -7397,6 +7403,12 @@ static void __noclone vmx_vcpu_run(struc
if (test_bit(VCPU_REGS_RIP, (unsigned long *)&vcpu->arch.regs_dirty))
vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);

+ cr4 = read_cr4();
+ if (unlikely(cr4 != vmx->host_state.vmcs_host_cr4)) {
+ vmcs_writel(HOST_CR4, cr4);
+ vmx->host_state.vmcs_host_cr4 = cr4;
+ }
+
/* When single-stepping over STI and MOV SS, we must clear the
* corresponding interruptibility bits in the guest state. Otherwise
* vmentry fails as it then expects bit 14 (BS) in pending debug

2014-10-28 03:37:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 036/146] qla2xxx: fix kernel NULL pointer access

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Himanshu Madhani <[email protected]>

commit 78c2106a50e067f7168ee8c0944baaeb0e988272 upstream.

This patch is to fix regression added by commit id
51a07f84649d2be206c4c2ad9a612956db0c2f8c.

When allocating memory for new session original patch does
not assign vha to op->vha resulting into NULL pointer
access during qlt_create_sess_from_atio().

Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Saurav Kashyap <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/qla2xxx/qla_target.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -3277,6 +3277,7 @@ static int qlt_handle_cmd_for_atio(struc
return -ENOMEM;

memcpy(&op->atio, atio, sizeof(*atio));
+ op->vha = vha;
INIT_WORK(&op->work, qlt_create_sess_from_atio);
queue_work(qla_tgt_wq, &op->work);
return 0;

2014-10-28 03:37:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 018/146] KVM: do not bias the generation number in kvm_current_mmio_generation

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <[email protected]>

commit 00f034a12fdd81210d58116326d92780aac5c238 upstream.

The next patch will give a meaning (a la seqcount) to the low bit of the
generation number. Ensure that it matches between kvm->memslots->generation
and kvm_current_mmio_generation().

Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/mmu.c | 7 +------
virt/kvm/kvm_main.c | 7 +++++++
2 files changed, 8 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -240,12 +240,7 @@ static unsigned int get_mmio_spte_genera

static unsigned int kvm_current_mmio_generation(struct kvm *kvm)
{
- /*
- * Init kvm generation close to MMIO_MAX_GEN to easily test the
- * code of handling generation number wrap-around.
- */
- return (kvm_memslots(kvm)->generation +
- MMIO_MAX_GEN - 150) & MMIO_GEN_MASK;
+ return kvm_memslots(kvm)->generation & MMIO_GEN_MASK;
}

static void mark_mmio_spte(struct kvm *kvm, u64 *sptep, u64 gfn,
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -474,6 +474,13 @@ static struct kvm *kvm_create_vm(unsigne
kvm->memslots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
if (!kvm->memslots)
goto out_err_no_srcu;
+
+ /*
+ * Init kvm generation close to the maximum to easily test the
+ * code of handling generation number wrap-around.
+ */
+ kvm->memslots->generation = -150;
+
kvm_init_memslots_id(kvm);
if (init_srcu_struct(&kvm->srcu))
goto out_err_no_srcu;

2014-10-28 03:37:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 017/146] kvm: fix potentially corrupt mmio cache

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Matlack <[email protected]>

commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream.

vcpu exits and memslot mutations can run concurrently as long as the
vcpu does not aquire the slots mutex. Thus it is theoretically possible
for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after
synchronize_srcu_expedited(), vcpus can safely cache memslot generation
without maintaining a single rcu_dereference through an entire vm exit.
And much of the x86/kvm code does not maintain a single rcu_dereference
of the current memslots during each exit.

We can prevent the following case:

vcpu (CPU 0) | thread (CPU 1)
--------------------------------------------+--------------------------
1 vm exit |
2 srcu_read_unlock(&kvm->srcu) |
3 decide to cache something based on |
old memslots |
4 | change memslots
| (increments generation)
5 | synchronize_srcu(&kvm->srcu);
6 retrieve generation # from new memslots |
7 tag cache with new memslot generation |
8 srcu_read_unlock(&kvm->srcu) |
... |
<action based on cache occurs even |
though the caching decision was based |
on the old memslots> |
... |
<action *continues* to occur until next |
memslot generation change, which may |
be never> |
|

By incrementing the generation after synchronizing with kvm->srcu readers,
we ensure that the generation retrieved in (6) will become invalid soon
after (8).

Keeping the existing increment is not strictly necessary, but we
do keep it and just move it for consistency from update_memslots to
install_new_memslots. It invalidates old cached MMIOs immediately,
instead of having to wait for the end of synchronize_srcu_expedited,
which makes the code more clearly correct in case CPU 1 is preempted
right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the
low bit of the generation is zero when reconstructing a generation number
out of an SPTE. This effectively disables MMIO caching in SPTEs during
the call to synchronize_srcu_expedited. Using the low bit this way is
somewhat like a seqcount---where the protected thing is a cache, and
instead of retrying we can simply punt if we observe the low bit to be 1.

Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/virtual/kvm/mmu.txt | 14 ++++++++++++++
arch/x86/kvm/mmu.c | 20 ++++++++++++--------
virt/kvm/kvm_main.c | 23 ++++++++++++++++-------
3 files changed, 42 insertions(+), 15 deletions(-)

--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -425,6 +425,20 @@ fault through the slow path.
Since only 19 bits are used to store generation-number on mmio spte, all
pages are zapped when there is an overflow.

+Unfortunately, a single memory access might access kvm_memslots(kvm) multiple
+times, the last one happening when the generation number is retrieved and
+stored into the MMIO spte. Thus, the MMIO spte might be created based on
+out-of-date information, but with an up-to-date generation number.
+
+To avoid this, the generation number is incremented again after synchronize_srcu
+returns; thus, the low bit of kvm_memslots(kvm)->generation is only 1 during a
+memslot update, while some SRCU readers might be using the old copy. We do not
+want to use an MMIO sptes created with an odd generation number, and we can do
+this without losing a bit in the MMIO spte. The low bit of the generation
+is not stored in MMIO spte, and presumed zero when it is extracted out of the
+spte. If KVM is unlucky and creates an MMIO spte while the low bit is 1,
+the next access to the spte will always be a cache miss.
+

Further reading
===============
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -199,16 +199,20 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio
EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);

/*
- * spte bits of bit 3 ~ bit 11 are used as low 9 bits of generation number,
- * the bits of bits 52 ~ bit 61 are used as high 10 bits of generation
- * number.
+ * the low bit of the generation number is always presumed to be zero.
+ * This disables mmio caching during memslot updates. The concept is
+ * similar to a seqcount but instead of retrying the access we just punt
+ * and ignore the cache.
+ *
+ * spte bits 3-11 are used as bits 1-9 of the generation number,
+ * the bits 52-61 are used as bits 10-19 of the generation number.
*/
-#define MMIO_SPTE_GEN_LOW_SHIFT 3
+#define MMIO_SPTE_GEN_LOW_SHIFT 2
#define MMIO_SPTE_GEN_HIGH_SHIFT 52

-#define MMIO_GEN_SHIFT 19
-#define MMIO_GEN_LOW_SHIFT 9
-#define MMIO_GEN_LOW_MASK ((1 << MMIO_GEN_LOW_SHIFT) - 1)
+#define MMIO_GEN_SHIFT 20
+#define MMIO_GEN_LOW_SHIFT 10
+#define MMIO_GEN_LOW_MASK ((1 << MMIO_GEN_LOW_SHIFT) - 2)
#define MMIO_GEN_MASK ((1 << MMIO_GEN_SHIFT) - 1)
#define MMIO_MAX_GEN ((1 << MMIO_GEN_SHIFT) - 1)

@@ -4433,7 +4437,7 @@ void kvm_mmu_invalidate_mmio_sptes(struc
* The very rare case: if the generation-number is round,
* zap all shadow pages.
*/
- if (unlikely(kvm_current_mmio_generation(kvm) >= MMIO_MAX_GEN)) {
+ if (unlikely(kvm_current_mmio_generation(kvm) == 0)) {
printk_ratelimited(KERN_INFO "kvm: zapping shadow pages for mmio generation wraparound\n");
kvm_mmu_invalidate_zap_all_pages(kvm);
}
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -95,8 +95,6 @@ static int hardware_enable_all(void);
static void hardware_disable_all(void);

static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
-static void update_memslots(struct kvm_memslots *slots,
- struct kvm_memory_slot *new, u64 last_generation);

static void kvm_release_pfn_dirty(pfn_t pfn);
static void mark_page_dirty_in_slot(struct kvm *kvm,
@@ -687,8 +685,7 @@ static void sort_memslots(struct kvm_mem
}

static void update_memslots(struct kvm_memslots *slots,
- struct kvm_memory_slot *new,
- u64 last_generation)
+ struct kvm_memory_slot *new)
{
if (new) {
int id = new->id;
@@ -699,8 +696,6 @@ static void update_memslots(struct kvm_m
if (new->npages != npages)
sort_memslots(slots);
}
-
- slots->generation = last_generation + 1;
}

static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
@@ -722,10 +717,24 @@ static struct kvm_memslots *install_new_
{
struct kvm_memslots *old_memslots = kvm->memslots;

- update_memslots(slots, new, kvm->memslots->generation);
+ /*
+ * Set the low bit in the generation, which disables SPTE caching
+ * until the end of synchronize_srcu_expedited.
+ */
+ WARN_ON(old_memslots->generation & 1);
+ slots->generation = old_memslots->generation + 1;
+
+ update_memslots(slots, new);
rcu_assign_pointer(kvm->memslots, slots);
synchronize_srcu_expedited(&kvm->srcu);

+ /*
+ * Increment the new memslot generation a second time. This prevents
+ * vm exits that race with memslot updates from caching a memslot
+ * generation that will (potentially) be valid forever.
+ */
+ slots->generation++;
+
kvm_arch_memslots_updated(kvm);

return old_memslots;

2014-10-28 07:59:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 019/146] KVM: s390: unintended fallthrough for external call

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christian Borntraeger <[email protected]>

commit f346026e55f1efd3949a67ddd1dcea7c1b9a615e upstream.

We must not fallthrough if the conditions for external call are not met.

Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/s390/kvm/interrupt.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -85,6 +85,7 @@ static int __interrupt_is_deliverable(st
return 0;
if (vcpu->arch.sie_block->gcr[0] & 0x2000ul)
return 1;
+ return 0;
case KVM_S390_INT_EMERGENCY:
if (psw_extint_disabled(vcpu))
return 0;

2014-10-28 07:59:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 034/146] mptfusion: enable no_write_same for vmware scsi disks

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chris J Arges <[email protected]>

commit 4089b71cc820a426d601283c92fcd4ffeb5139c2 upstream.

When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.

BugLink: http://bugs.launchpad.net/bugs/1371591
Reported-by: Bruce Lucas <[email protected]>
Tested-by: Chris J Arges <[email protected]>
Signed-off-by: Chris J Arges <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/message/fusion/mptspi.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,6 +1419,11 @@ mptspi_probe(struct pci_dev *pdev, const
goto out_mptspi_probe;
}

+ /* VMWare emulation doesn't properly implement WRITE_SAME
+ */
+ if (pdev->subsystem_vendor == 0x15AD)
+ sh->no_write_same = 1;
+
spin_lock_irqsave(&ioc->FreeQlock, flags);

/* Attach the SCSI Host to the IOC structure

2014-10-28 03:37:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 032/146] regmap: fix possible ZERO_SIZE_PTR pointer dereferencing error.

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xiubo Li <[email protected]>

commit d6b41cb06044a7d895db82bdd54f6e4219970510 upstream.

Since we cannot make sure the 'val_count' will always be none zero
here, and then if it equals to zero, the kmemdup() will return
ZERO_SIZE_PTR, which equals to ((void *)16).

So this patch fix this with just doing the zero check before calling
kmemdup().

Signed-off-by: Xiubo Li <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/regmap/regmap.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1659,6 +1659,9 @@ out:
} else {
void *wval;

+ if (!val_count)
+ return -EINVAL;
+
wval = kmemdup(val, val_count * val_bytes, GFP_KERNEL);
if (!wval) {
dev_err(map->dev, "Error in memory allocation\n");

2014-10-28 08:00:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 033/146] be2iscsi: check ip buffer before copying

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Christie <[email protected]>

commit a41a9ad3bbf61fae0b6bfb232153da60d14fdbd9 upstream.

Dan Carpenter found a issue where be2iscsi would copy the ip
from userspace to the driver buffer before checking the len
of the data being copied:
http://marc.info/?l=linux-scsi&m=140982651504251&w=2

This patch just has us only copy what we the driver buffer
can support.

Tested-by: John Soni Jose <[email protected]>
Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/be2iscsi/be_mgmt.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/scsi/be2iscsi/be_mgmt.c
+++ b/drivers/scsi/be2iscsi/be_mgmt.c
@@ -943,17 +943,20 @@ mgmt_static_ip_modify(struct beiscsi_hba

if (ip_action == IP_ACTION_ADD) {
memcpy(req->ip_params.ip_record.ip_addr.addr, ip_param->value,
- ip_param->len);
+ sizeof(req->ip_params.ip_record.ip_addr.addr));

if (subnet_param)
memcpy(req->ip_params.ip_record.ip_addr.subnet_mask,
- subnet_param->value, subnet_param->len);
+ subnet_param->value,
+ sizeof(req->ip_params.ip_record.ip_addr.subnet_mask));
} else {
memcpy(req->ip_params.ip_record.ip_addr.addr,
- if_info->ip_addr.addr, ip_param->len);
+ if_info->ip_addr.addr,
+ sizeof(req->ip_params.ip_record.ip_addr.addr));

memcpy(req->ip_params.ip_record.ip_addr.subnet_mask,
- if_info->ip_addr.subnet_mask, ip_param->len);
+ if_info->ip_addr.subnet_mask,
+ sizeof(req->ip_params.ip_record.ip_addr.subnet_mask));
}

rc = mgmt_exec_nonemb_cmd(phba, &nonemb_cmd, NULL, 0);
@@ -981,7 +984,7 @@ static int mgmt_modify_gateway(struct be
req->action = gtway_action;
req->ip_addr.ip_type = BE2_IPV4;

- memcpy(req->ip_addr.addr, gt_addr, param_len);
+ memcpy(req->ip_addr.addr, gt_addr, sizeof(req->ip_addr.addr));

return mgmt_exec_nonemb_cmd(phba, &nonemb_cmd, NULL, 0);
}

2014-10-28 08:00:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 016/146] kvm: x86: fix stale mmio cache bug

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Matlack <[email protected]>

commit 56f17dd3fbc44adcdbc3340fe3988ddb833a47a7 upstream.

The following events can lead to an incorrect KVM_EXIT_MMIO bubbling
up to userspace:

(1) Guest accesses gpa X without a memory slot. The gfn is cached in
struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets
the SPTE write-execute-noread so that future accesses cause
EPT_MISCONFIGs.

(2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION
covering the page just accessed.

(3) Guest attempts to read or write to gpa X again. On Intel, this
generates an EPT_MISCONFIG. The memory slot generation number that
was incremented in (2) would normally take care of this but we fast
path mmio faults through quickly_check_mmio_pf(), which only checks
the per-vcpu mmio cache. Since we hit the cache, KVM passes a
KVM_EXIT_MMIO up to userspace.

This patch fixes the issue by using the memslot generation number
to validate the mmio cache.

Signed-off-by: David Matlack <[email protected]>
[xiaoguangrong: adjust the code to make it simpler for stable-tree fix.]
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Tested-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.c | 2 +-
arch/x86/kvm/x86.h | 20 +++++++++++++++-----
3 files changed, 17 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -481,6 +481,7 @@ struct kvm_vcpu_arch {
u64 mmio_gva;
unsigned access;
gfn_t mmio_gfn;
+ u64 mmio_gen;

struct kvm_pmu pmu;

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3163,7 +3163,7 @@ static void mmu_sync_roots(struct kvm_vc
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
return;

- vcpu_clear_mmio_info(vcpu, ~0ul);
+ vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
kvm_mmu_audit(vcpu, AUDIT_PRE_SYNC);
if (vcpu->arch.mmu.root_level == PT64_ROOT_LEVEL) {
hpa_t root = vcpu->arch.mmu.root_hpa;
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -88,15 +88,23 @@ static inline void vcpu_cache_mmio_info(
vcpu->arch.mmio_gva = gva & PAGE_MASK;
vcpu->arch.access = access;
vcpu->arch.mmio_gfn = gfn;
+ vcpu->arch.mmio_gen = kvm_memslots(vcpu->kvm)->generation;
+}
+
+static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.mmio_gen == kvm_memslots(vcpu->kvm)->generation;
}

/*
- * Clear the mmio cache info for the given gva,
- * specially, if gva is ~0ul, we clear all mmio cache info.
+ * Clear the mmio cache info for the given gva. If gva is MMIO_GVA_ANY, we
+ * clear all mmio cache info.
*/
+#define MMIO_GVA_ANY (~(gva_t)0)
+
static inline void vcpu_clear_mmio_info(struct kvm_vcpu *vcpu, gva_t gva)
{
- if (gva != (~0ul) && vcpu->arch.mmio_gva != (gva & PAGE_MASK))
+ if (gva != MMIO_GVA_ANY && vcpu->arch.mmio_gva != (gva & PAGE_MASK))
return;

vcpu->arch.mmio_gva = 0;
@@ -104,7 +112,8 @@ static inline void vcpu_clear_mmio_info(

static inline bool vcpu_match_mmio_gva(struct kvm_vcpu *vcpu, unsigned long gva)
{
- if (vcpu->arch.mmio_gva && vcpu->arch.mmio_gva == (gva & PAGE_MASK))
+ if (vcpu_match_mmio_gen(vcpu) && vcpu->arch.mmio_gva &&
+ vcpu->arch.mmio_gva == (gva & PAGE_MASK))
return true;

return false;
@@ -112,7 +121,8 @@ static inline bool vcpu_match_mmio_gva(s

static inline bool vcpu_match_mmio_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
{
- if (vcpu->arch.mmio_gfn && vcpu->arch.mmio_gfn == gpa >> PAGE_SHIFT)
+ if (vcpu_match_mmio_gen(vcpu) && vcpu->arch.mmio_gfn &&
+ vcpu->arch.mmio_gfn == gpa >> PAGE_SHIFT)
return true;

return false;

2014-10-28 03:36:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 031/146] regmap: fix NULL pointer dereference in _regmap_write/read

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Pankaj Dubey <[email protected]>

commit 5336be8416a71b5568d2cf54a2f2066abe9f2a53 upstream.

If LOG_DEVICE is defined and map->dev is NULL it will lead to NULL
pointer dereference. This patch fixes this issue by adding check for
dev->NULL in all such places in regmap.c

Signed-off-by: Pankaj Dubey <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/regmap/regmap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1408,7 +1408,7 @@ int _regmap_write(struct regmap *map, un
}

#ifdef LOG_DEVICE
- if (strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
+ if (map->dev && strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
dev_info(map->dev, "%x <= %x\n", reg, val);
#endif

@@ -2058,7 +2058,7 @@ static int _regmap_read(struct regmap *m
ret = map->reg_read(context, reg, val);
if (ret == 0) {
#ifdef LOG_DEVICE
- if (strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
+ if (map->dev && strcmp(dev_name(map->dev), LOG_DEVICE) == 0)
dev_info(map->dev, "%x => %x\n", reg, *val);
#endif


2014-10-28 08:01:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 030/146] regmap: debugfs: fix possbile NULL pointer dereference

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xiubo Li <[email protected]>

commit 2c98e0c1cc6b8e86f1978286c3d4e0769ee9d733 upstream.

If 'map->dev' is NULL and there will lead dev_name() to be NULL pointer
dereference. So before dev_name(), we need to have check of the map->dev
pionter.

We also should make sure that the 'name' pointer shouldn't be NULL for
debugfs_create_dir(). So here using one default "dummy" debugfs name when
the 'name' pointer and 'map->dev' are both NULL.

Signed-off-by: Xiubo Li <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/base/regmap/regmap-debugfs.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/base/regmap/regmap-debugfs.c
+++ b/drivers/base/regmap/regmap-debugfs.c
@@ -473,6 +473,7 @@ void regmap_debugfs_init(struct regmap *
{
struct rb_node *next;
struct regmap_range_node *range_node;
+ const char *devname = "dummy";

/* If we don't have the debugfs root yet, postpone init */
if (!regmap_debugfs_root) {
@@ -491,12 +492,15 @@ void regmap_debugfs_init(struct regmap *
INIT_LIST_HEAD(&map->debugfs_off_cache);
mutex_init(&map->cache_lock);

+ if (map->dev)
+ devname = dev_name(map->dev);
+
if (name) {
map->debugfs_name = kasprintf(GFP_KERNEL, "%s-%s",
- dev_name(map->dev), name);
+ devname, name);
name = map->debugfs_name;
} else {
- name = dev_name(map->dev);
+ name = devname;
}

map->debugfs = debugfs_create_dir(name, regmap_debugfs_root);

2014-10-28 08:02:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 027/146] HID: wacom: fix timeout on probe for some wacoms

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Benjamin Tissoires <[email protected]>

commit 8ffffd5212846b72f116f7a9572e83d580e25802 upstream.

Some Wacom tablets (at least the ISDv4 found in the Lenovo X230) timeout
during probe while retrieving the input reports.
The only time this information is valuable is during the feature_mapping
stage, so we can ask for it there and discard the generic input reports
retrieval.

This gives a code path closer to the wacom.ko driver when it was in the
input subtree (not HID).

Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/wacom_sys.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

--- a/drivers/hid/wacom_sys.c
+++ b/drivers/hid/wacom_sys.c
@@ -106,12 +106,24 @@ static void wacom_feature_mapping(struct
{
struct wacom *wacom = hid_get_drvdata(hdev);
struct wacom_features *features = &wacom->wacom_wac.features;
+ u8 *data;
+ int ret;

switch (usage->hid) {
case HID_DG_CONTACTMAX:
/* leave touch_max as is if predefined */
- if (!features->touch_max)
- features->touch_max = field->value[0];
+ if (!features->touch_max) {
+ /* read manually */
+ data = kzalloc(2, GFP_KERNEL);
+ if (!data)
+ break;
+ data[0] = field->report->id;
+ ret = wacom_get_report(hdev, HID_FEATURE_REPORT,
+ data, 2, 0);
+ if (ret == 2)
+ features->touch_max = data[1];
+ kfree(data);
+ }
break;
}
}
@@ -1245,6 +1257,8 @@ static int wacom_probe(struct hid_device
if (!id->driver_data)
return -EINVAL;

+ hdev->quirks |= HID_QUIRK_NO_INIT_REPORTS;
+
wacom = kzalloc(sizeof(struct wacom), GFP_KERNEL);
if (!wacom)
return -ENOMEM;

2014-10-28 08:02:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 025/146] spi: dw-mid: check that DMA was inited before exit

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Shevchenko <[email protected]>

commit fb57862ead652454ceeb659617404c5f13bc34b5 upstream.

If the driver was compiled with DMA support, but DMA channels weren't acquired
by some reason, mid_spi_dma_exit() will crash the kernel.

Fixes: 7063c0d942a1 (spi/dw_spi: add DMA support)
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/spi/spi-dw-mid.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/spi/spi-dw-mid.c
+++ b/drivers/spi/spi-dw-mid.c
@@ -89,6 +89,8 @@ err_exit:

static void mid_spi_dma_exit(struct dw_spi *dws)
{
+ if (!dws->dma_inited)
+ return;
dma_release_channel(dws->txchan);
dma_release_channel(dws->rxchan);
}

2014-10-28 08:04:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 009/146] Btrfs: cleanup error handling in build_backref_tree

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josef Bacik <[email protected]>

commit 75bfb9aff45e44625260f52a5fd581b92ace3e62 upstream.

When balance panics it tends to panic in the

BUG_ON(!upper->checked);

test, because it means it couldn't build the backref tree properly. This is
annoying to users and frankly a recoverable error, nothing in this function is
actually fatal since it is just an in-memory building of the backrefs for a
given bytenr. So go through and change all the BUG_ON()'s to ASSERT()'s, and
fix the BUG_ON(!upper->checked) thing to just return an error.

This patch also fixes the error handling so it tears down the work we've done
properly. This code was horribly broken since we always just panic'ed instead
of actually erroring out, so it needed to be completely re-worked. With this
patch my broken image no longer panics when I mount it. Thanks,

Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/relocation.c | 88 +++++++++++++++++++++++++++++++++-----------------
1 file changed, 59 insertions(+), 29 deletions(-)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -736,7 +736,8 @@ again:
err = ret;
goto out;
}
- BUG_ON(!ret || !path1->slots[0]);
+ ASSERT(ret);
+ ASSERT(path1->slots[0]);

path1->slots[0]--;

@@ -746,10 +747,10 @@ again:
* the backref was added previously when processing
* backref of type BTRFS_TREE_BLOCK_REF_KEY
*/
- BUG_ON(!list_is_singular(&cur->upper));
+ ASSERT(list_is_singular(&cur->upper));
edge = list_entry(cur->upper.next, struct backref_edge,
list[LOWER]);
- BUG_ON(!list_empty(&edge->list[UPPER]));
+ ASSERT(list_empty(&edge->list[UPPER]));
exist = edge->node[UPPER];
/*
* add the upper level block to pending list if we need
@@ -831,7 +832,7 @@ again:
cur->cowonly = 1;
}
#else
- BUG_ON(key.type == BTRFS_EXTENT_REF_V0_KEY);
+ ASSERT(key.type != BTRFS_EXTENT_REF_V0_KEY);
if (key.type == BTRFS_SHARED_BLOCK_REF_KEY) {
#endif
if (key.objectid == key.offset) {
@@ -840,7 +841,7 @@ again:
* backref of this type.
*/
root = find_reloc_root(rc, cur->bytenr);
- BUG_ON(!root);
+ ASSERT(root);
cur->root = root;
break;
}
@@ -868,7 +869,7 @@ again:
} else {
upper = rb_entry(rb_node, struct backref_node,
rb_node);
- BUG_ON(!upper->checked);
+ ASSERT(upper->checked);
INIT_LIST_HEAD(&edge->list[UPPER]);
}
list_add_tail(&edge->list[LOWER], &cur->upper);
@@ -892,7 +893,7 @@ again:

if (btrfs_root_level(&root->root_item) == cur->level) {
/* tree root */
- BUG_ON(btrfs_root_bytenr(&root->root_item) !=
+ ASSERT(btrfs_root_bytenr(&root->root_item) ==
cur->bytenr);
if (should_ignore_root(root))
list_add(&cur->list, &useless);
@@ -927,7 +928,7 @@ again:
need_check = true;
for (; level < BTRFS_MAX_LEVEL; level++) {
if (!path2->nodes[level]) {
- BUG_ON(btrfs_root_bytenr(&root->root_item) !=
+ ASSERT(btrfs_root_bytenr(&root->root_item) ==
lower->bytenr);
if (should_ignore_root(root))
list_add(&lower->list, &useless);
@@ -982,7 +983,7 @@ again:
} else {
upper = rb_entry(rb_node, struct backref_node,
rb_node);
- BUG_ON(!upper->checked);
+ ASSERT(upper->checked);
INIT_LIST_HEAD(&edge->list[UPPER]);
if (!upper->owner)
upper->owner = btrfs_header_owner(eb);
@@ -1026,7 +1027,7 @@ next:
* everything goes well, connect backref nodes and insert backref nodes
* into the cache.
*/
- BUG_ON(!node->checked);
+ ASSERT(node->checked);
cowonly = node->cowonly;
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, node->bytenr,
@@ -1062,8 +1063,21 @@ next:
continue;
}

- BUG_ON(!upper->checked);
- BUG_ON(cowonly != upper->cowonly);
+ if (!upper->checked) {
+ /*
+ * Still want to blow up for developers since this is a
+ * logic bug.
+ */
+ ASSERT(0);
+ err = -EINVAL;
+ goto out;
+ }
+ if (cowonly != upper->cowonly) {
+ ASSERT(0);
+ err = -EINVAL;
+ goto out;
+ }
+
if (!cowonly) {
rb_node = tree_insert(&cache->rb_root, upper->bytenr,
&upper->rb_node);
@@ -1086,7 +1100,7 @@ next:
while (!list_empty(&useless)) {
upper = list_entry(useless.next, struct backref_node, list);
list_del_init(&upper->list);
- BUG_ON(!list_empty(&upper->upper));
+ ASSERT(list_empty(&upper->upper));
if (upper == node)
node = NULL;
if (upper->lowest) {
@@ -1119,29 +1133,45 @@ out:
if (err) {
while (!list_empty(&useless)) {
lower = list_entry(useless.next,
- struct backref_node, upper);
- list_del_init(&lower->upper);
+ struct backref_node, list);
+ list_del_init(&lower->list);
}
- upper = node;
- INIT_LIST_HEAD(&list);
- while (upper) {
- if (RB_EMPTY_NODE(&upper->rb_node)) {
- list_splice_tail(&upper->upper, &list);
- free_backref_node(cache, upper);
- }
-
- if (list_empty(&list))
- break;
-
- edge = list_entry(list.next, struct backref_edge,
- list[LOWER]);
+ while (!list_empty(&list)) {
+ edge = list_first_entry(&list, struct backref_edge,
+ list[UPPER]);
+ list_del(&edge->list[UPPER]);
list_del(&edge->list[LOWER]);
+ lower = edge->node[LOWER];
upper = edge->node[UPPER];
free_backref_edge(cache, edge);
+
+ /*
+ * Lower is no longer linked to any upper backref nodes
+ * and isn't in the cache, we can free it ourselves.
+ */
+ if (list_empty(&lower->upper) &&
+ RB_EMPTY_NODE(&lower->rb_node))
+ list_add(&lower->list, &useless);
+
+ if (!RB_EMPTY_NODE(&upper->rb_node))
+ continue;
+
+ /* Add this guy's upper edges to the list to proces */
+ list_for_each_entry(edge, &upper->upper, list[LOWER])
+ list_add_tail(&edge->list[UPPER], &list);
+ if (list_empty(&upper->upper))
+ list_add(&upper->list, &useless);
+ }
+
+ while (!list_empty(&useless)) {
+ lower = list_entry(useless.next,
+ struct backref_node, list);
+ list_del_init(&lower->list);
+ free_backref_node(cache, lower);
}
return ERR_PTR(err);
}
- BUG_ON(node && node->detached);
+ ASSERT(!node || !node->detached);
return node;
}


2014-10-28 08:05:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 007/146] Btrfs: dont do async reclaim during log replay

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josef Bacik <[email protected]>

commit f6acfd50110b335c7af636cf1fc8e55319cae5fc upstream.

Trying to reproduce a log enospc bug I hit a panic in the async reclaim code
during log replay. This is because we use fs_info->fs_root as our root for
shrinking and such. Technically we can use whatever root we want, but let's
just not allow async reclaim while we're doing log replay. Thanks,

Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/extent-tree.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4502,7 +4502,13 @@ again:
space_info->flush = 1;
} else if (!ret && space_info->flags & BTRFS_BLOCK_GROUP_METADATA) {
used += orig_bytes;
- if (need_do_async_reclaim(space_info, root->fs_info, used) &&
+ /*
+ * We will do the space reservation dance during log replay,
+ * which means we won't have fs_info->fs_root set, so don't do
+ * the async reclaim as we will panic.
+ */
+ if (!root->fs_info->log_root_recovering &&
+ need_do_async_reclaim(space_info, root->fs_info, used) &&
!work_busy(&root->fs_info->async_reclaim_work))
queue_work(system_unbound_wq,
&root->fs_info->async_reclaim_work);

2014-10-28 08:06:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 004/146] Btrfs: add missing compression property remove in btrfs_ioctl_setflags

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <[email protected]>

commit 78a017a2c92df9b571db0a55a016280f9019c65e upstream.

The behaviour of a 'chattr -c' consists of getting the current flags,
clearing the FS_COMPR_FL bit and then sending the result to the set
flags ioctl - this means the bit FS_NOCOMP_FL isn't set in the flags
passed to the ioctl. This results in the compression property not being
cleared from the inode - it was cleared only if the bit FS_NOCOMP_FL
was set in the received flags.

Reproducer:

$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt && cd /mnt
$ mkdir a
$ chattr +c a
$ touch a/file
$ lsattr a/file
--------c------- a/file
$ chattr -c a
$ touch a/file2
$ lsattr a/file2
--------c------- a/file2
$ lsattr -d a
---------------- a

Reported-by: Andreas Schneider <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/ioctl.c | 3 +++
1 file changed, 3 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -332,6 +332,9 @@ static int btrfs_ioctl_setflags(struct f
goto out_drop;

} else {
+ ret = btrfs_set_prop(inode, "btrfs.compression", NULL, 0, 0);
+ if (ret && ret != -ENODATA)
+ goto out_drop;
ip->flags &= ~(BTRFS_INODE_COMPRESS | BTRFS_INODE_NOCOMPRESS);
}


2014-10-28 08:06:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 002/146] btrfs: dont go readonly on existing qgroup items

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mark Fasheh <[email protected]>

commit 0b4699dcb65c2cff793210b07f40b98c2d423a43 upstream.

btrfs_drop_snapshot() leaves subvolume qgroup items on disk after
completion. This can cause problems with snapshot creation. If a new
snapshot tries to claim the deleted subvolumes id, btrfs will get -EEXIST
from add_qgroup_item() and go read-only. The following commands will
reproduce this problem (assume btrfs is on /dev/sda and is mounted at
/btrfs)

mkfs.btrfs -f /dev/sda
mount -t btrfs /dev/sda /btrfs/
btrfs quota enable /btrfs/
btrfs su sna /btrfs/ /btrfs/snap
btrfs su de /btrfs/snap
sleep 45
umount /btrfs/
mount -t btrfs /dev/sda /btrfs/

We can fix this by catching -EEXIST in add_qgroup_item() and
initializing the existing items. We have the problem of orphaned
relation items being on disk from an old snapshot but that is outside
the scope of this patch.

Signed-off-by: Mark Fasheh <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/qgroup.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -551,9 +551,15 @@ static int add_qgroup_item(struct btrfs_
key.type = BTRFS_QGROUP_INFO_KEY;
key.offset = qgroupid;

+ /*
+ * Avoid a transaction abort by catching -EEXIST here. In that
+ * case, we proceed by re-initializing the existing structure
+ * on disk.
+ */
+
ret = btrfs_insert_empty_item(trans, quota_root, path, &key,
sizeof(*qgroup_info));
- if (ret)
+ if (ret && ret != -EEXIST)
goto out;

leaf = path->nodes[0];
@@ -572,7 +578,7 @@ static int add_qgroup_item(struct btrfs_
key.type = BTRFS_QGROUP_LIMIT_KEY;
ret = btrfs_insert_empty_item(trans, quota_root, path, &key,
sizeof(*qgroup_limit));
- if (ret)
+ if (ret && ret != -EEXIST)
goto out;

leaf = path->nodes[0];

2014-10-28 03:36:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 011/146] btrfs: Fix the wrong condition judgment about subset extent map

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Qu Wenruo <[email protected]>

commit 32be3a1ac6d09576c57063c6c350ca36eaebdbd3 upstream.

Previous commit: btrfs: Fix and enhance merge_extent_mapping() to insert
best fitted extent map
is using wrong condition to judgement whether the range is a subset of a
existing extent map.

This may cause bug in btrfs no-holes mode.

This patch will correct the judgment and fix the bug.

Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6528,7 +6528,7 @@ insert:
* extent causing the -EEXIST.
*/
if (start >= extent_map_end(existing) ||
- start + len <= existing->start) {
+ start <= existing->start) {
/*
* The existing extent map is the one nearest to
* the [start, start + len) range which overlaps

2014-10-28 08:08:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 013/146] Revert "Btrfs: race free update of commit root for ro snapshots"

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chris Mason <[email protected]>

commit d37973082b453ba6b89ec07eb7b84305895d35e1 upstream.

This reverts commit 9c3b306e1c9e6be4be09e99a8fe2227d1005effc.

Switching only one commit root during a transaction is wrong because it
leads the fs into an inconsistent state. All commit roots should be
switched at once, at transaction commit time, otherwise backref walking
can often miss important references that were only accessible through
the old commit root. Plus, the root item for the snapshot's root wasn't
getting updated and preventing the next transaction commit to do it.

This made several users get into random corruption issues after creation
of readonly snapshots.

A regression test for xfstests will follow soon.

Cc: [email protected] # 3.17
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/inode.c | 36 ------------------------------------
fs/btrfs/ioctl.c | 33 +++++++++++++++++++++++++++++++++
2 files changed, 33 insertions(+), 36 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5203,42 +5203,6 @@ struct inode *btrfs_lookup_dentry(struct
iput(inode);
inode = ERR_PTR(ret);
}
- /*
- * If orphan cleanup did remove any orphans, it means the tree
- * was modified and therefore the commit root is not the same as
- * the current root anymore. This is a problem, because send
- * uses the commit root and therefore can see inode items that
- * don't exist in the current root anymore, and for example make
- * calls to btrfs_iget, which will do tree lookups based on the
- * current root and not on the commit root. Those lookups will
- * fail, returning a -ESTALE error, and making send fail with
- * that error. So make sure a send does not see any orphans we
- * have just removed, and that it will see the same inodes
- * regardless of whether a transaction commit happened before
- * it started (meaning that the commit root will be the same as
- * the current root) or not.
- */
- if (sub_root->node != sub_root->commit_root) {
- u64 sub_flags = btrfs_root_flags(&sub_root->root_item);
-
- if (sub_flags & BTRFS_ROOT_SUBVOL_RDONLY) {
- struct extent_buffer *eb;
-
- /*
- * Assert we can't have races between dentry
- * lookup called through the snapshot creation
- * ioctl and the VFS.
- */
- ASSERT(mutex_is_locked(&dir->i_mutex));
-
- down_write(&root->fs_info->commit_root_sem);
- eb = sub_root->commit_root;
- sub_root->commit_root =
- btrfs_root_node(sub_root);
- up_write(&root->fs_info->commit_root_sem);
- free_extent_buffer(eb);
- }
- }
}

return inode;
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -714,6 +714,39 @@ static int create_snapshot(struct btrfs_
if (ret)
goto fail;

+ ret = btrfs_orphan_cleanup(pending_snapshot->snap);
+ if (ret)
+ goto fail;
+
+ /*
+ * If orphan cleanup did remove any orphans, it means the tree was
+ * modified and therefore the commit root is not the same as the
+ * current root anymore. This is a problem, because send uses the
+ * commit root and therefore can see inode items that don't exist
+ * in the current root anymore, and for example make calls to
+ * btrfs_iget, which will do tree lookups based on the current root
+ * and not on the commit root. Those lookups will fail, returning a
+ * -ESTALE error, and making send fail with that error. So make sure
+ * a send does not see any orphans we have just removed, and that it
+ * will see the same inodes regardless of whether a transaction
+ * commit happened before it started (meaning that the commit root
+ * will be the same as the current root) or not.
+ */
+ if (readonly && pending_snapshot->snap->node !=
+ pending_snapshot->snap->commit_root) {
+ trans = btrfs_join_transaction(pending_snapshot->snap);
+ if (IS_ERR(trans) && PTR_ERR(trans) != -ENOENT) {
+ ret = PTR_ERR(trans);
+ goto fail;
+ }
+ if (!IS_ERR(trans)) {
+ ret = btrfs_commit_transaction(trans,
+ pending_snapshot->snap);
+ if (ret)
+ goto fail;
+ }
+ }
+
inode = btrfs_lookup_dentry(dentry->d_parent->d_inode, dentry);
if (IS_ERR(inode)) {
ret = PTR_ERR(inode);

2014-10-28 08:08:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.17 010/146] Btrfs: fix build_backref_tree issue with multiple shared blocks

3.17-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josef Bacik <[email protected]>

commit bbe9051441effce51c9a533d2c56440df64db2d7 upstream.

Marc Merlin sent me a broken fs image months ago where it would blow up in the
upper->checked BUG_ON() in build_backref_tree. This is because we had a
scenario like this

block a -- level 4 (not shared)
|
block b -- level 3 (reloc block, shared)
|
block c -- level 2 (not shared)
|
block d -- level 1 (shared)
|
block e -- level 0 (shared)

We go to build a backref tree for block e, we notice block d is shared and add
it to the list of blocks to lookup it's backrefs for. Now when we loop around
we will check edges for the block, so we will see we looked up block c last
time. So we lookup block d and then see that the block that points to it is
block c and we can just skip that edge since we've already been up this path.
The problem is because we clear need_check when we see block d (as it is shared)
we never add block b as needing to be checked. And because block c is in our
path already we bail out before we walk up to block b and add it to the backref
check list.

To fix this we need to reset need_check if we trip over a block that doesn't
need to be checked. This will make sure that any subsequent blocks in the path
as we're walking up afterwards are added to the list to be processed. With this
patch I can now mount Marc's fs image and it'll complete the balance without
panicing. Thanks,

Reported-by: Marc MERLIN <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/relocation.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -978,8 +978,11 @@ again:
need_check = false;
list_add_tail(&edge->list[UPPER],
&list);
- } else
+ } else {
+ if (upper->checked)
+ need_check = true;
INIT_LIST_HEAD(&edge->list[UPPER]);
+ }
} else {
upper = rb_entry(rb_node, struct backref_node,
rb_node);

2014-10-28 15:15:11

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.17 000/146] 3.17.2-stable review

On Tue, Oct 28, 2014 at 11:32:22AM +0800, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.17.2 release.
> There are 146 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Oct 30 03:32:38 UTC 2014.
> Anything received after that time might be too late.
>
Build results:
total: 133 pass: 133 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter

2014-10-28 16:15:13

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 3.17 000/146] 3.17.2-stable review

On 10/27/2014 09:32 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.17.2 release.
> There are 146 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Oct 30 03:32:38 UTC 2014.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.2-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No regressions in dmesg.

-- Shuah


--
Shuah Khan
Sr. Linux Kernel Developer
Samsung Research America (Silicon Valley)
[email protected] | (970) 217-8978

2014-10-28 23:41:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.17 000/146] 3.17.2-stable review

On Tue, Oct 28, 2014 at 10:15:08AM -0600, Shuah Khan wrote:
> On 10/27/2014 09:32 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.17.2 release.
> > There are 146 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu Oct 30 03:32:38 UTC 2014.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.2-rc1.gz
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Compiled and booted on my test system. No regressions in dmesg.

Thanks for testing all of these and letting me know.

greg k-h

2014-10-28 23:41:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.17 000/146] 3.17.2-stable review

On Tue, Oct 28, 2014 at 08:15:05AM -0700, Guenter Roeck wrote:
> On Tue, Oct 28, 2014 at 11:32:22AM +0800, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.17.2 release.
> > There are 146 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu Oct 30 03:32:38 UTC 2014.
> > Anything received after that time might be too late.
> >
> Build results:
> total: 133 pass: 133 fail: 0
> Qemu test results:
> total: 30 pass: 30 fail: 0
>
> Details are available at http://server.roeck-us.net:8010/builders.

Thanks for letting me know about all of these.

greg k-h