2015-07-01 18:51:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 00/34] 3.14.47-stable review

This is the start of the stable review cycle for the 3.14.47 release.
There are 34 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri Jul 3 18:39:45 UTC 2015.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.47-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.14.47-rc1

Christoffer Dall <[email protected]>
arm/arm64: KVM: Don't allow creating VCPUs after vgic_initialized

Christoffer Dall <[email protected]>
arm/arm64: KVM: Introduce stage2_unmap_vm

Christoffer Dall <[email protected]>
arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu

Christoffer Dall <[email protected]>
arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option

Christoffer Dall <[email protected]>
arm/arm64: KVM: Don't clear the VCPU_POWER_OFF flag

Ard Biesheuvel <[email protected]>
arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()

Geoff Levand <[email protected]>
arm64/kvm: Fix assembler compatibility of macros

Christoffer Dall <[email protected]>
arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()

Mark Rutland <[email protected]>
arm64: KVM: fix unmapping with 48-bit VAs

Steve Capper <[email protected]>
arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort

Christoffer Dall <[email protected]>
arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

Ard Biesheuvel <[email protected]>
arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()

Vladimir Murzin <[email protected]>
arm: kvm: fix CPU hotplug

Joel Schopp <[email protected]>
arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc

Christoffer Dall <[email protected]>
arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset

Marc Zyngier <[email protected]>
KVM: ARM: vgic: plug irq injection race

Ard Biesheuvel <[email protected]>
ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault()

Greg Ungerer <[email protected]>
bus: mvebu: pass the coherency availability information at init time

Bandan Das <[email protected]>
KVM: nSVM: Check for NRIPS support before updating control field

Sebastien Szymanski <[email protected]>
ARM: clk-imx6q: refine sata's parent

Ben Hutchings <[email protected]>
splice: Apply generic position and size checks to each write

Or Gerlitz <[email protected]>
net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN

Filipe Manana <[email protected]>
Btrfs: make xattr replace operations atomic

Quentin Casasnovas <[email protected]>
x86/microcode/intel: Guard against stack overflow in the loader

Tomas Henzl <[email protected]>
hpsa: add missing pci_set_master in kdump path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: allow to change chain policy without hook if it exists

Pablo Neira Ayuso <[email protected]>
netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set

Ian Wilson <[email protected]>
netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()

Tomas Henzl <[email protected]>
hpsa: refine the pci enable/disable handling

Jim Snow <[email protected]>
sb_edac: Fix erroneous bytes->gigabytes conversion

Chen Gang <[email protected]>
netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings

Konrad Rzeszutek Wilk <[email protected]>
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected

Eugene Shatokhin <[email protected]>
kprobes/x86: Return correct length in __copy_instruction()

Marek Szyprowski <[email protected]>
arm64: dma-mapping: always clear allocated buffers


-------------

Diffstat:

Documentation/virtual/kvm/api.txt | 3 +-
Makefile | 4 +-
arch/arm/include/asm/kvm_emulate.h | 5 +
arch/arm/include/asm/kvm_mmu.h | 12 +--
arch/arm/kvm/arm.c | 25 ++++-
arch/arm/kvm/guest.c | 1 -
arch/arm/kvm/mmu.c | 104 ++++++++++++++++++-
arch/arm/mach-dove/board-dt.c | 2 +-
arch/arm/mach-imx/clk-imx6q.c | 2 +-
arch/arm/mach-kirkwood/board-dt.c | 2 +-
arch/arm/mach-mvebu/armada-370-xp.c | 2 +-
arch/arm/mach-mvebu/coherency.c | 15 +++
arch/arm/mach-mvebu/coherency.h | 1 +
arch/arm64/include/asm/kvm_arm.h | 32 ++++--
arch/arm64/include/asm/kvm_emulate.h | 5 +
arch/arm64/include/asm/kvm_mmu.h | 19 +---
arch/arm64/kvm/guest.c | 1 -
arch/arm64/mm/dma-mapping.c | 3 +-
arch/x86/Kconfig | 2 +-
arch/x86/kernel/cpu/microcode/intel_early.c | 2 +-
arch/x86/kernel/kprobes/core.c | 7 +-
arch/x86/kvm/svm.c | 8 +-
drivers/bus/mvebu-mbus.c | 11 +-
drivers/edac/sb_edac.c | 38 +++----
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 7 +-
drivers/scsi/hpsa.c | 42 +++++---
fs/btrfs/ctree.c | 2 +-
fs/btrfs/ctree.h | 5 +
fs/btrfs/dir-item.c | 10 +-
fs/btrfs/xattr.c | 150 +++++++++++++++++-----------
fs/ocfs2/file.c | 8 +-
fs/splice.c | 8 +-
include/linux/mbus.h | 2 +-
net/netfilter/nf_tables_api.c | 5 +-
net/netfilter/nfnetlink_cthelper.c | 7 +-
net/netfilter/nft_compat.c | 6 ++
virt/kvm/arm/vgic.c | 15 +--
37 files changed, 389 insertions(+), 184 deletions(-)


2015-07-01 18:50:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 01/34] arm64: dma-mapping: always clear allocated buffers

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marek Szyprowski <[email protected]>

commit 6829e274a623187c24f7cfc0e3d35f25d087fcc5 upstream.

Buffers allocated by dma_alloc_coherent() are always zeroed on Alpha,
ARM (32bit), MIPS, PowerPC, x86/x86_64 and probably other architectures.
It turned out that some drivers rely on this 'feature'. Allocated buffer
might be also exposed to userspace with dma_mmap() call, so clearing it
is desired from security point of view to avoid exposing random memory
to userspace. This patch unifies dma_alloc_coherent() behavior on ARM64
architecture with other implementations by unconditionally zeroing
allocated buffer.

Signed-off-by: Marek Szyprowski <[email protected]>
[will: ported to 3.14.y]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/mm/dma-mapping.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -54,8 +54,7 @@ static void *arm64_swiotlb_alloc_coheren

*dma_handle = phys_to_dma(dev, page_to_phys(page));
addr = page_address(page);
- if (flags & __GFP_ZERO)
- memset(addr, 0, size);
+ memset(addr, 0, size);
return addr;
} else {
return swiotlb_alloc_coherent(dev, size, dma_handle, flags);

2015-07-01 18:50:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 02/34] kprobes/x86: Return correct length in __copy_instruction()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eugene Shatokhin <[email protected]>

commit c80e5c0c23ce2282476fdc64c4b5e3d3a40723fd upstream.

On x86-64, __copy_instruction() always returns 0 (error) if the
instruction uses %rip-relative addressing. This is because
kernel_insn_init() is called the second time for 'insn' instance
in such cases and sets all its fields to 0.

Because of this, trying to place a kprobe on such instruction
will fail, register_kprobe() will return -EINVAL.

This patch fixes the problem.

Signed-off-by: Eugene Shatokhin <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/kprobes/core.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -326,13 +326,16 @@ int __kprobes __copy_instruction(u8 *des
{
struct insn insn;
kprobe_opcode_t buf[MAX_INSN_SIZE];
+ int length;

kernel_insn_init(&insn, (void *)recover_probed_instruction(buf, (unsigned long)src));
insn_get_length(&insn);
+ length = insn.length;
+
/* Another subsystem puts a breakpoint, failed to recover */
if (insn.opcode.bytes[0] == BREAKPOINT_INSTRUCTION)
return 0;
- memcpy(dest, insn.kaddr, insn.length);
+ memcpy(dest, insn.kaddr, length);

#ifdef CONFIG_X86_64
if (insn_rip_relative(&insn)) {
@@ -362,7 +365,7 @@ int __kprobes __copy_instruction(u8 *des
*(s32 *) disp = (s32) newdisp;
}
#endif
- return insn.length;
+ return length;
}

static int __kprobes arch_copy_kprobe(struct kprobe *p)

2015-07-01 18:50:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 03/34] config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konrad Rzeszutek Wilk <[email protected]>

commit a6dfa128ce5c414ab46b1d690f7a1b8decb8526d upstream.

A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: David S. Miller <[email protected]>
Acked-by: Prashant Sreedharan <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michael Chan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -160,7 +160,7 @@ config SBUS

config NEED_DMA_MAP_STATE
def_bool y
- depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG
+ depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB

config NEED_SG_DMA_LENGTH
def_bool y

2015-07-01 18:49:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 05/34] sb_edac: Fix erroneous bytes->gigabytes conversion

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jim Snow <[email protected]>

commit 8c009100295597f23978c224aec5751a365bc965 upstream.

Signed-off-by: Jim Snow <[email protected]>
Signed-off-by: Lukasz Anaczkowski <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
[ vlee: Backported to 3.14. Adjusted context. ]
Signed-off-by: Vinson Lee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/edac/sb_edac.c | 38 ++++++++++++++++++++------------------
1 file changed, 20 insertions(+), 18 deletions(-)

--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -765,7 +765,7 @@ static void get_memory_layout(const stru
u32 reg;
u64 limit, prv = 0;
u64 tmp_mb;
- u32 mb, kb;
+ u32 gb, mb;
u32 rir_way;

/*
@@ -775,15 +775,17 @@ static void get_memory_layout(const stru
pvt->tolm = pvt->info.get_tolm(pvt);
tmp_mb = (1 + pvt->tolm) >> 20;

- mb = div_u64_rem(tmp_mb, 1000, &kb);
- edac_dbg(0, "TOLM: %u.%03u GB (0x%016Lx)\n", mb, kb, (u64)pvt->tolm);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
+ edac_dbg(0, "TOLM: %u.%03u GB (0x%016Lx)\n",
+ gb, (mb*1000)/1024, (u64)pvt->tolm);

/* Address range is already 45:25 */
pvt->tohm = pvt->info.get_tohm(pvt);
tmp_mb = (1 + pvt->tohm) >> 20;

- mb = div_u64_rem(tmp_mb, 1000, &kb);
- edac_dbg(0, "TOHM: %u.%03u GB (0x%016Lx)\n", mb, kb, (u64)pvt->tohm);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
+ edac_dbg(0, "TOHM: %u.%03u GB (0x%016Lx)\n",
+ gb, (mb*1000)/1024, (u64)pvt->tohm);

/*
* Step 2) Get SAD range and SAD Interleave list
@@ -805,11 +807,11 @@ static void get_memory_layout(const stru
break;

tmp_mb = (limit + 1) >> 20;
- mb = div_u64_rem(tmp_mb, 1000, &kb);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
edac_dbg(0, "SAD#%d %s up to %u.%03u GB (0x%016Lx) Interleave: %s reg=0x%08x\n",
n_sads,
get_dram_attr(reg),
- mb, kb,
+ gb, (mb*1000)/1024,
((u64)tmp_mb) << 20L,
INTERLEAVE_MODE(reg) ? "8:6" : "[8:6]XOR[18:16]",
reg);
@@ -840,9 +842,9 @@ static void get_memory_layout(const stru
break;
tmp_mb = (limit + 1) >> 20;

- mb = div_u64_rem(tmp_mb, 1000, &kb);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
edac_dbg(0, "TAD#%d: up to %u.%03u GB (0x%016Lx), socket interleave %d, memory interleave %d, TGT: %d, %d, %d, %d, reg=0x%08x\n",
- n_tads, mb, kb,
+ n_tads, gb, (mb*1000)/1024,
((u64)tmp_mb) << 20L,
(u32)TAD_SOCK(reg),
(u32)TAD_CH(reg),
@@ -865,10 +867,10 @@ static void get_memory_layout(const stru
tad_ch_nilv_offset[j],
&reg);
tmp_mb = TAD_OFFSET(reg) >> 20;
- mb = div_u64_rem(tmp_mb, 1000, &kb);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
edac_dbg(0, "TAD CH#%d, offset #%d: %u.%03u GB (0x%016Lx), reg=0x%08x\n",
i, j,
- mb, kb,
+ gb, (mb*1000)/1024,
((u64)tmp_mb) << 20L,
reg);
}
@@ -890,10 +892,10 @@ static void get_memory_layout(const stru

tmp_mb = RIR_LIMIT(reg) >> 20;
rir_way = 1 << RIR_WAY(reg);
- mb = div_u64_rem(tmp_mb, 1000, &kb);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
edac_dbg(0, "CH#%d RIR#%d, limit: %u.%03u GB (0x%016Lx), way: %d, reg=0x%08x\n",
i, j,
- mb, kb,
+ gb, (mb*1000)/1024,
((u64)tmp_mb) << 20L,
rir_way,
reg);
@@ -904,10 +906,10 @@ static void get_memory_layout(const stru
&reg);
tmp_mb = RIR_OFFSET(reg) << 6;

- mb = div_u64_rem(tmp_mb, 1000, &kb);
+ gb = div_u64_rem(tmp_mb, 1024, &mb);
edac_dbg(0, "CH#%d RIR#%d INTL#%d, offset %u.%03u GB (0x%016Lx), tgt: %d, reg=0x%08x\n",
i, j, k,
- mb, kb,
+ gb, (mb*1000)/1024,
((u64)tmp_mb) << 20L,
(u32)RIR_RNK_TGT(reg),
reg);
@@ -945,7 +947,7 @@ static int get_memory_error_data(struct
u8 ch_way, sck_way, pkg, sad_ha = 0;
u32 tad_offset;
u32 rir_way;
- u32 mb, kb;
+ u32 mb, gb;
u64 ch_addr, offset, limit = 0, prv = 0;


@@ -1183,10 +1185,10 @@ static int get_memory_error_data(struct
continue;

limit = RIR_LIMIT(reg);
- mb = div_u64_rem(limit >> 20, 1000, &kb);
+ gb = div_u64_rem(limit >> 20, 1024, &mb);
edac_dbg(0, "RIR#%d, limit: %u.%03u GB (0x%016Lx), way: %d\n",
n_rir,
- mb, kb,
+ gb, (mb*1000)/1024,
limit,
1 << RIR_WAY(reg));
if (ch_addr <= limit)

2015-07-01 18:49:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 06/34] hpsa: refine the pci enable/disable handling

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tomas Henzl <[email protected]>

commit 132aa220b45d60e9b20def1e9d8be9422eed9616 upstream.

When a second(kdump) kernel starts and the hard reset method is used
the driver calls pci_disable_device without previously enabling it,
so the kernel shows a warning -
[ 16.876248] WARNING: at drivers/pci/pci.c:1431 pci_disable_device+0x84/0x90()
[ 16.882686] Device hpsa
disabling already-disabled device
...
This patch fixes it, in addition to this I tried to balance also some other pairs
of enable/disable device in the driver.
Unfortunately I wasn't able to verify the functionality for the case of a sw reset,
because of a lack of proper hw.

Signed-off-by: Tomas Henzl <[email protected]>
Reviewed-by: Stephen M. Cameron <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/hpsa.c | 42 ++++++++++++++++++++++++++++--------------
1 file changed, 28 insertions(+), 14 deletions(-)

--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -3984,10 +3984,6 @@ static int hpsa_kdump_hard_reset_control

/* Save the PCI command register */
pci_read_config_word(pdev, 4, &command_register);
- /* Turn the board off. This is so that later pci_restore_state()
- * won't turn the board on before the rest of config space is ready.
- */
- pci_disable_device(pdev);
pci_save_state(pdev);

/* find the first memory BAR, so we can find the cfg table */
@@ -4035,11 +4031,6 @@ static int hpsa_kdump_hard_reset_control
goto unmap_cfgtable;

pci_restore_state(pdev);
- rc = pci_enable_device(pdev);
- if (rc) {
- dev_warn(&pdev->dev, "failed to enable device.\n");
- goto unmap_cfgtable;
- }
pci_write_config_word(pdev, 4, command_register);

/* Some devices (notably the HP Smart Array 5i Controller)
@@ -4525,6 +4516,23 @@ static int hpsa_init_reset_devices(struc
if (!reset_devices)
return 0;

+ /* kdump kernel is loading, we don't know in which state is
+ * the pci interface. The dev->enable_cnt is equal zero
+ * so we call enable+disable, wait a while and switch it on.
+ */
+ rc = pci_enable_device(pdev);
+ if (rc) {
+ dev_warn(&pdev->dev, "Failed to enable PCI device\n");
+ return -ENODEV;
+ }
+ pci_disable_device(pdev);
+ msleep(260); /* a randomly chosen number */
+ rc = pci_enable_device(pdev);
+ if (rc) {
+ dev_warn(&pdev->dev, "failed to enable device.\n");
+ return -ENODEV;
+ }
+
/* Reset the controller with a PCI power-cycle or via doorbell */
rc = hpsa_kdump_hard_reset_controller(pdev);

@@ -4533,10 +4541,11 @@ static int hpsa_init_reset_devices(struc
* "performant mode". Or, it might be 640x, which can't reset
* due to concerns about shared bbwc between 6402/6404 pair.
*/
- if (rc == -ENOTSUPP)
- return rc; /* just try to do the kdump anyhow. */
- if (rc)
- return -ENODEV;
+ if (rc) {
+ if (rc != -ENOTSUPP) /* just try to do the kdump anyhow. */
+ rc = -ENODEV;
+ goto out_disable;
+ }

/* Now try to get the controller to respond to a no-op */
dev_warn(&pdev->dev, "Waiting for controller to respond to no-op\n");
@@ -4547,7 +4556,11 @@ static int hpsa_init_reset_devices(struc
dev_warn(&pdev->dev, "no-op failed%s\n",
(i < 11 ? "; re-trying" : ""));
}
- return 0;
+
+out_disable:
+
+ pci_disable_device(pdev);
+ return rc;
}

static int hpsa_allocate_cmd_pool(struct ctlr_info *h)
@@ -4690,6 +4703,7 @@ static void hpsa_undo_allocations_after_
iounmap(h->transtable);
if (h->cfgtable)
iounmap(h->cfgtable);
+ pci_disable_device(h->pdev);
pci_release_regions(h->pdev);
kfree(h);
}

2015-07-01 18:50:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 07/34] netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ian Wilson <[email protected]>

commit 78146572b9cd20452da47951812f35b1ad4906be upstream.

nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
nfnl_cthelper_get() and nfnl_cthelper_del(). In each case they pass
a pointer to an nf_conntrack_tuple data structure local variable:

struct nf_conntrack_tuple tuple;
...
ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]);

The problem is that this local variable is not initialized, and
nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
dst.protonum. This leaves all other fields with undefined values
based on whatever is on the stack:

tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);

The symptom observed was that when the rpc and tns helpers were added
then traffic to port 1536 was being sent to user-space.

Signed-off-by: Ian Wilson <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nfnetlink_cthelper.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -77,6 +77,9 @@ nfnl_cthelper_parse_tuple(struct nf_conn
if (!tb[NFCTH_TUPLE_L3PROTONUM] || !tb[NFCTH_TUPLE_L4PROTONUM])
return -EINVAL;

+ /* Not all fields are initialized so first zero the tuple */
+ memset(tuple, 0, sizeof(struct nf_conntrack_tuple));
+
tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);


2015-07-01 18:42:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 08/34] netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Pablo Neira Ayuso <[email protected]>

commit 749177ccc74f9c6d0f51bd78a15c652a2134aa11 upstream.

ip6tables extensions check for this flag to restrict match/target to a
given protocol. Without this flag set, SYNPROXY6 returns an error.

Signed-off-by: Pablo Neira Ayuso <[email protected]>
Acked-by: Patrick McHardy <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nft_compat.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -82,6 +82,9 @@ nft_target_set_tgchk_param(struct xt_tgc
entry->e4.ip.invflags = inv ? IPT_INV_PROTO : 0;
break;
case AF_INET6:
+ if (proto)
+ entry->e6.ipv6.flags |= IP6T_F_PROTO;
+
entry->e6.ipv6.proto = proto;
entry->e6.ipv6.invflags = inv ? IP6T_INV_PROTO : 0;
break;
@@ -313,6 +316,9 @@ nft_match_set_mtchk_param(struct xt_mtch
entry->e4.ip.invflags = inv ? IPT_INV_PROTO : 0;
break;
case AF_INET6:
+ if (proto)
+ entry->e6.ipv6.flags |= IP6T_F_PROTO;
+
entry->e6.ipv6.proto = proto;
entry->e6.ipv6.invflags = inv ? IP6T_INV_PROTO : 0;
break;

2015-07-01 18:50:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 09/34] netfilter: nf_tables: allow to change chain policy without hook if it exists

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Pablo Neira Ayuso <[email protected]>

commit d6b6cb1d3e6f78d55c2d4043d77d0d8def3f3b99 upstream.

If there's an existing base chain, we have to allow to change the
default policy without indicating the hook information.

However, if the chain doesn't exists, we have to enforce the presence of
the hook attribute.

Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nf_tables_api.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -855,7 +855,10 @@ static int nf_tables_newchain(struct soc

if (nla[NFTA_CHAIN_POLICY]) {
if ((chain != NULL &&
- !(chain->flags & NFT_BASE_CHAIN)) ||
+ !(chain->flags & NFT_BASE_CHAIN)))
+ return -EOPNOTSUPP;
+
+ if (chain == NULL &&
nla[NFTA_CHAIN_HOOK] == NULL)
return -EOPNOTSUPP;


2015-07-01 18:47:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 10/34] hpsa: add missing pci_set_master in kdump path

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tomas Henzl <[email protected]>

commit 859c75aba20264d87dd026bab0d0ca3bff385955 upstream.

Add a call to pci_set_master(...) missing in the previous
patch "hpsa: refine the pci enable/disable handling".
Found thanks to Rob Elliot.

Signed-off-by: Tomas Henzl <[email protected]>
Reviewed-by: Robert Elliott <[email protected]>
Tested-by: Robert Elliott <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/hpsa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -4532,7 +4532,7 @@ static int hpsa_init_reset_devices(struc
dev_warn(&pdev->dev, "failed to enable device.\n");
return -ENODEV;
}
-
+ pci_set_master(pdev);
/* Reset the controller with a PCI power-cycle or via doorbell */
rc = hpsa_kdump_hard_reset_controller(pdev);


2015-07-01 18:47:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 11/34] x86/microcode/intel: Guard against stack overflow in the loader

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Quentin Casasnovas <[email protected]>

commit f84598bd7c851f8b0bf8cd0d7c3be0d73c432ff4 upstream.

mc_saved_tmp is a static array allocated on the stack, we need to make
sure mc_saved_count stays within its bounds, otherwise we're overflowing
the stack in _save_mc(). A specially crafted microcode header could lead
to a kernel crash or potentially kernel execution.

Signed-off-by: Quentin Casasnovas <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Fenghua Yu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/microcode/intel_early.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/microcode/intel_early.c
+++ b/arch/x86/kernel/cpu/microcode/intel_early.c
@@ -321,7 +321,7 @@ get_matching_model_microcode(int cpu, un
unsigned int mc_saved_count = mc_saved_data->mc_saved_count;
int i;

- while (leftover) {
+ while (leftover && mc_saved_count < ARRAY_SIZE(mc_saved_tmp)) {
mc_header = (struct microcode_header_intel *)ucode_ptr;

mc_size = get_totalsize(mc_header);

2015-07-01 18:47:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 12/34] Btrfs: make xattr replace operations atomic

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <[email protected]>

commit 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339 upstream.

Replacing a xattr consists of doing a lookup for its existing value, delete
the current value from the respective leaf, release the search path and then
finally insert the new value. This leaves a time window where readers (getxattr,
listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs,
so this has security implications.

This change also fixes 2 other existing issues which were:

*) Deleting the old xattr value without verifying first if the new xattr will
fit in the existing leaf item (in case multiple xattrs are packed in the
same item due to name hash collision);

*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't
exist but we have have an existing item that packs muliple xattrs with
the same name hash as the input xattr. In this case we should return ENOSPC.

A test case for xfstests follows soon.

Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace
implementation.

Reported-by: Alexandre Oliva <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/ctree.c | 2
fs/btrfs/ctree.h | 5 +
fs/btrfs/dir-item.c | 10 +--
fs/btrfs/xattr.c | 150 ++++++++++++++++++++++++++++++++--------------------
4 files changed, 102 insertions(+), 65 deletions(-)

--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2963,7 +2963,7 @@ done:
*/
if (!p->leave_spinning)
btrfs_set_path_blocking(p);
- if (ret < 0)
+ if (ret < 0 && !p->skip_release_on_error)
btrfs_release_path(p);
return ret;
}
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -608,6 +608,7 @@ struct btrfs_path {
unsigned int skip_locking:1;
unsigned int leave_spinning:1;
unsigned int search_commit_root:1;
+ unsigned int skip_release_on_error:1;
};

/*
@@ -3609,6 +3610,10 @@ struct btrfs_dir_item *btrfs_lookup_xatt
int verify_dir_item(struct btrfs_root *root,
struct extent_buffer *leaf,
struct btrfs_dir_item *dir_item);
+struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name,
+ int name_len);

/* orphan.c */
int btrfs_insert_orphan_item(struct btrfs_trans_handle *trans,
--- a/fs/btrfs/dir-item.c
+++ b/fs/btrfs/dir-item.c
@@ -21,10 +21,6 @@
#include "hash.h"
#include "transaction.h"

-static struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
- struct btrfs_path *path,
- const char *name, int name_len);
-
/*
* insert a name into a directory, doing overflow properly if there is a hash
* collision. data_size indicates how big the item inserted should be. On
@@ -383,9 +379,9 @@ struct btrfs_dir_item *btrfs_lookup_xatt
* this walks through all the entries in a dir item and finds one
* for a specific name.
*/
-static struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
- struct btrfs_path *path,
- const char *name, int name_len)
+struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_root *root,
+ struct btrfs_path *path,
+ const char *name, int name_len)
{
struct btrfs_dir_item *dir_item;
unsigned long name_ptr;
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -29,6 +29,7 @@
#include "xattr.h"
#include "disk-io.h"
#include "props.h"
+#include "locking.h"


ssize_t __btrfs_getxattr(struct inode *inode, const char *name,
@@ -91,7 +92,7 @@ static int do_setxattr(struct btrfs_tran
struct inode *inode, const char *name,
const void *value, size_t size, int flags)
{
- struct btrfs_dir_item *di;
+ struct btrfs_dir_item *di = NULL;
struct btrfs_root *root = BTRFS_I(inode)->root;
struct btrfs_path *path;
size_t name_len = strlen(name);
@@ -103,84 +104,119 @@ static int do_setxattr(struct btrfs_tran
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
+ path->skip_release_on_error = 1;
+
+ if (!value) {
+ di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode),
+ name, name_len, -1);
+ if (!di && (flags & XATTR_REPLACE))
+ ret = -ENODATA;
+ else if (di)
+ ret = btrfs_delete_one_dir_name(trans, root, path, di);
+ goto out;
+ }

+ /*
+ * For a replace we can't just do the insert blindly.
+ * Do a lookup first (read-only btrfs_search_slot), and return if xattr
+ * doesn't exist. If it exists, fall down below to the insert/replace
+ * path - we can't race with a concurrent xattr delete, because the VFS
+ * locks the inode's i_mutex before calling setxattr or removexattr.
+ */
if (flags & XATTR_REPLACE) {
- di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode), name,
- name_len, -1);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
- goto out;
- } else if (!di) {
+ ASSERT(mutex_is_locked(&inode->i_mutex));
+ di = btrfs_lookup_xattr(NULL, root, path, btrfs_ino(inode),
+ name, name_len, 0);
+ if (!di) {
ret = -ENODATA;
goto out;
}
- ret = btrfs_delete_one_dir_name(trans, root, path, di);
- if (ret)
- goto out;
btrfs_release_path(path);
+ di = NULL;
+ }

+ ret = btrfs_insert_xattr_item(trans, root, path, btrfs_ino(inode),
+ name, name_len, value, size);
+ if (ret == -EOVERFLOW) {
/*
- * remove the attribute
+ * We have an existing item in a leaf, split_leaf couldn't
+ * expand it. That item might have or not a dir_item that
+ * matches our target xattr, so lets check.
*/
- if (!value)
- goto out;
- } else {
- di = btrfs_lookup_xattr(NULL, root, path, btrfs_ino(inode),
- name, name_len, 0);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
+ ret = 0;
+ btrfs_assert_tree_locked(path->nodes[0]);
+ di = btrfs_match_dir_item_name(root, path, name, name_len);
+ if (!di && !(flags & XATTR_REPLACE)) {
+ ret = -ENOSPC;
goto out;
}
- if (!di && !value)
- goto out;
- btrfs_release_path(path);
+ } else if (ret == -EEXIST) {
+ ret = 0;
+ di = btrfs_match_dir_item_name(root, path, name, name_len);
+ ASSERT(di); /* logic error */
+ } else if (ret) {
+ goto out;
}

-again:
- ret = btrfs_insert_xattr_item(trans, root, path, btrfs_ino(inode),
- name, name_len, value, size);
- /*
- * If we're setting an xattr to a new value but the new value is say
- * exactly BTRFS_MAX_XATTR_SIZE, we could end up with EOVERFLOW getting
- * back from split_leaf. This is because it thinks we'll be extending
- * the existing item size, but we're asking for enough space to add the
- * item itself. So if we get EOVERFLOW just set ret to EEXIST and let
- * the rest of the function figure it out.
- */
- if (ret == -EOVERFLOW)
+ if (di && (flags & XATTR_CREATE)) {
ret = -EEXIST;
+ goto out;
+ }

- if (ret == -EEXIST) {
- if (flags & XATTR_CREATE)
- goto out;
+ if (di) {
/*
- * We can't use the path we already have since we won't have the
- * proper locking for a delete, so release the path and
- * re-lookup to delete the thing.
+ * We're doing a replace, and it must be atomic, that is, at
+ * any point in time we have either the old or the new xattr
+ * value in the tree. We don't want readers (getxattr and
+ * listxattrs) to miss a value, this is specially important
+ * for ACLs.
*/
- btrfs_release_path(path);
- di = btrfs_lookup_xattr(trans, root, path, btrfs_ino(inode),
- name, name_len, -1);
- if (IS_ERR(di)) {
- ret = PTR_ERR(di);
- goto out;
- } else if (!di) {
- /* Shouldn't happen but just in case... */
- btrfs_release_path(path);
- goto again;
+ const int slot = path->slots[0];
+ struct extent_buffer *leaf = path->nodes[0];
+ const u16 old_data_len = btrfs_dir_data_len(leaf, di);
+ const u32 item_size = btrfs_item_size_nr(leaf, slot);
+ const u32 data_size = sizeof(*di) + name_len + size;
+ struct btrfs_item *item;
+ unsigned long data_ptr;
+ char *ptr;
+
+ if (size > old_data_len) {
+ if (btrfs_leaf_free_space(root, leaf) <
+ (size - old_data_len)) {
+ ret = -ENOSPC;
+ goto out;
+ }
}

- ret = btrfs_delete_one_dir_name(trans, root, path, di);
- if (ret)
- goto out;
+ if (old_data_len + name_len + sizeof(*di) == item_size) {
+ /* No other xattrs packed in the same leaf item. */
+ if (size > old_data_len)
+ btrfs_extend_item(root, path,
+ size - old_data_len);
+ else if (size < old_data_len)
+ btrfs_truncate_item(root, path, data_size, 1);
+ } else {
+ /* There are other xattrs packed in the same item. */
+ ret = btrfs_delete_one_dir_name(trans, root, path, di);
+ if (ret)
+ goto out;
+ btrfs_extend_item(root, path, data_size);
+ }

+ item = btrfs_item_nr(slot);
+ ptr = btrfs_item_ptr(leaf, slot, char);
+ ptr += btrfs_item_size(leaf, item) - data_size;
+ di = (struct btrfs_dir_item *)ptr;
+ btrfs_set_dir_data_len(leaf, di, size);
+ data_ptr = ((unsigned long)(di + 1)) + name_len;
+ write_extent_buffer(leaf, value, data_ptr, size);
+ btrfs_mark_buffer_dirty(leaf);
+ } else {
/*
- * We have a value to set, so go back and try to insert it now.
+ * Insert, and we had space for the xattr, so path->slots[0] is
+ * where our xattr dir_item is and btrfs_insert_xattr_item()
+ * filled it.
*/
- if (value) {
- btrfs_release_path(path);
- goto again;
- }
}
out:
btrfs_free_path(path);

2015-07-01 18:47:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 13/34] net/mlx4_en: Dont attempt to TX offload the outer UDP checksum for VXLAN

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Or Gerlitz <[email protected]>

commit a4f2dacbf2a5045e34b98a35d9a3857800f25a7b upstream.

For VXLAN/NVGRE encapsulation, the current HW doesn't support offloading
both the outer UDP TX checksum and the inner TCP/UDP TX checksum.

The driver doesn't advertize SKB_GSO_UDP_TUNNEL_CSUM, however we are wrongly
telling the HW to offload the outer UDP checksum for encapsulated packets,
fix that.

Fixes: 837052d0ccc5 ('net/mlx4_en: Add netdev support for TCP/IP
offloads of vxlan tunneling')
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -810,8 +810,11 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff
tx_desc->ctrl.fence_size = (real_size / 16) & 0x3f;
tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
- tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
- MLX4_WQE_CTRL_TCP_UDP_CSUM);
+ if (!skb->encapsulation)
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM |
+ MLX4_WQE_CTRL_TCP_UDP_CSUM);
+ else
+ tx_desc->ctrl.srcrb_flags |= cpu_to_be32(MLX4_WQE_CTRL_IP_CSUM);
ring->tx_csum++;
}


2015-07-01 18:46:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 14/34] splice: Apply generic position and size checks to each write

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <[email protected]>

commit 894c6350eaad7e613ae267504014a456e00a3e2a from the 3.2-stable branch.

We need to check the position and size of file writes against various
limits, using generic_write_check(). This was not being done for
the splice write path. It was fixed upstream by commit 8d0207652cbe
("->splice_write() via ->write_iter()") but we can't apply that.

CVE-2014-7822

Signed-off-by: Ben Hutchings <[email protected]>
Cc: Vinson Lee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ocfs2/file.c | 8 ++++++--
fs/splice.c | 8 ++++++--
2 files changed, 12 insertions(+), 4 deletions(-)

--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2478,9 +2478,7 @@ static ssize_t ocfs2_file_splice_write(s
struct address_space *mapping = out->f_mapping;
struct inode *inode = mapping->host;
struct splice_desc sd = {
- .total_len = len,
.flags = flags,
- .pos = *ppos,
.u.file = out,
};

@@ -2490,6 +2488,12 @@ static ssize_t ocfs2_file_splice_write(s
out->f_path.dentry->d_name.len,
out->f_path.dentry->d_name.name, len);

+ ret = generic_write_checks(out, ppos, &len, 0);
+ if (ret)
+ return ret;
+ sd.total_len = len;
+ sd.pos = *ppos;
+
pipe_lock(pipe);

splice_from_pipe_begin(&sd);
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1012,13 +1012,17 @@ generic_file_splice_write(struct pipe_in
struct address_space *mapping = out->f_mapping;
struct inode *inode = mapping->host;
struct splice_desc sd = {
- .total_len = len,
.flags = flags,
- .pos = *ppos,
.u.file = out,
};
ssize_t ret;

+ ret = generic_write_checks(out, ppos, &len, S_ISBLK(inode->i_mode));
+ if (ret)
+ return ret;
+ sd.total_len = len;
+ sd.pos = *ppos;
+
pipe_lock(pipe);

splice_from_pipe_begin(&sd);

2015-07-01 18:46:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 15/34] ARM: clk-imx6q: refine satas parent

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sebastien Szymanski <[email protected]>

commit da946aeaeadcd24ff0cda9984c6fb8ed2bfd462a upstream.

According to IMX6D/Q RM, table 18-3, sata clock's parent is ahb, not ipg.

Signed-off-by: Sebastien Szymanski <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
[dirk.behme: Adjust moved file]
Signed-off-by: Dirk Behme <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-imx/clk-imx6q.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/mach-imx/clk-imx6q.c
+++ b/arch/arm/mach-imx/clk-imx6q.c
@@ -406,7 +406,7 @@ static void __init imx6q_clocks_init(str
clk[gpmi_io] = imx_clk_gate2("gpmi_io", "enfc", base + 0x78, 28);
clk[gpmi_apb] = imx_clk_gate2("gpmi_apb", "usdhc3", base + 0x78, 30);
clk[rom] = imx_clk_gate2("rom", "ahb", base + 0x7c, 0);
- clk[sata] = imx_clk_gate2("sata", "ipg", base + 0x7c, 4);
+ clk[sata] = imx_clk_gate2("sata", "ahb", base + 0x7c, 4);
clk[sdma] = imx_clk_gate2("sdma", "ahb", base + 0x7c, 6);
clk[spba] = imx_clk_gate2("spba", "ipg", base + 0x7c, 12);
clk[spdif] = imx_clk_gate2("spdif", "spdif_podf", base + 0x7c, 14);

2015-07-01 18:44:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 16/34] KVM: nSVM: Check for NRIPS support before updating control field

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bandan Das <[email protected]>

commit f104765b4f81fd74d69e0eb161e89096deade2db upstream.

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.

Signed-off-by: Bandan Das <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/svm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -495,8 +495,10 @@ static void skip_emulated_instruction(st
{
struct vcpu_svm *svm = to_svm(vcpu);

- if (svm->vmcb->control.next_rip != 0)
+ if (svm->vmcb->control.next_rip != 0) {
+ WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
svm->next_rip = svm->vmcb->control.next_rip;
+ }

if (!svm->next_rip) {
if (emulate_instruction(vcpu, EMULTYPE_SKIP) !=
@@ -4246,7 +4248,9 @@ static int svm_check_intercept(struct kv
break;
}

- vmcb->control.next_rip = info->next_rip;
+ /* TODO: Advertise NRIPS to guest hypervisor unconditionally */
+ if (static_cpu_has(X86_FEATURE_NRIPS))
+ vmcb->control.next_rip = info->next_rip;
vmcb->control.exit_code = icpt_info.exit_code;
vmexit = nested_svm_exit_handled(svm);


2015-07-01 18:44:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 17/34] bus: mvebu: pass the coherency availability information at init time

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Greg Ungerer <[email protected]>

commit 5686a1e5aa436c49187a60052d5885fb1f541ce6 upstream.

Until now, the mvebu-mbus was guessing by itself whether hardware I/O
coherency was available or not by poking into the Device Tree to see
if the coherency fabric Device Tree node was present or not.

However, on some upcoming SoCs, the presence or absence of the
coherency fabric DT node isn't sufficient: in CONFIG_SMP, the
coherency can be enabled, but not in !CONFIG_SMP.

In order to clean this up, the mvebu_mbus_dt_init() function is
extended to get a boolean argument telling whether coherency is
enabled or not. Therefore, the logic to decide whether coherency is
available or not now belongs to the core SoC code instead of the
mvebu-mbus driver itself, which is much better.

Signed-off-by: Thomas Petazzoni <[email protected]>
Link: https://lkml.kernel.org/r/1397483228-25625-4-git-send-email-thomas.petazzoni@free-electrons.com
Signed-off-by: Jason Cooper <[email protected]>

[ Greg Ungerer: back ported to linux-3.14.y
Back port necessary due to large code differences in affected files.
This change in combination with commit e553554536 ("ARM: mvebu: disable
I/O coherency on non-SMP situations on Armada 370/375/38x/XP") is
critical to the hardware I/O coherency being set correctly by both the
mbus driver and all peripheral hardware drivers. Without this change
drivers will incorrectly enable I/O coherency window attributes and
this causes rare unreliable system behavior including oops. ]

Signed-off-by: Greg Ungerer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/mach-dove/board-dt.c | 2 +-
arch/arm/mach-kirkwood/board-dt.c | 2 +-
arch/arm/mach-mvebu/armada-370-xp.c | 2 +-
arch/arm/mach-mvebu/coherency.c | 15 +++++++++++++++
arch/arm/mach-mvebu/coherency.h | 1 +
drivers/bus/mvebu-mbus.c | 11 +++--------
include/linux/mbus.h | 2 +-
7 files changed, 23 insertions(+), 12 deletions(-)

--- a/arch/arm/mach-dove/board-dt.c
+++ b/arch/arm/mach-dove/board-dt.c
@@ -26,7 +26,7 @@ static void __init dove_dt_init(void)
#ifdef CONFIG_CACHE_TAUROS2
tauros2_init(0);
#endif
- BUG_ON(mvebu_mbus_dt_init());
+ BUG_ON(mvebu_mbus_dt_init(false));
of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
}

--- a/arch/arm/mach-kirkwood/board-dt.c
+++ b/arch/arm/mach-kirkwood/board-dt.c
@@ -116,7 +116,7 @@ static void __init kirkwood_dt_init(void
*/
writel(readl(CPU_CONFIG) & ~CPU_CONFIG_ERROR_PROP, CPU_CONFIG);

- BUG_ON(mvebu_mbus_dt_init());
+ BUG_ON(mvebu_mbus_dt_init(false));

kirkwood_l2_init();

--- a/arch/arm/mach-mvebu/armada-370-xp.c
+++ b/arch/arm/mach-mvebu/armada-370-xp.c
@@ -41,7 +41,7 @@ static void __init armada_370_xp_timer_a
of_clk_init(NULL);
clocksource_of_init();
coherency_init();
- BUG_ON(mvebu_mbus_dt_init());
+ BUG_ON(mvebu_mbus_dt_init(coherency_available()));
#ifdef CONFIG_CACHE_L2X0
l2x0_of_init(0, ~0UL);
#endif
--- a/arch/arm/mach-mvebu/coherency.c
+++ b/arch/arm/mach-mvebu/coherency.c
@@ -121,6 +121,20 @@ static struct notifier_block mvebu_hwcc_
.notifier_call = mvebu_hwcc_platform_notifier,
};

+/*
+ * Keep track of whether we have IO hardware coherency enabled or not.
+ * On Armada 370's we will not be using it for example. We need to make
+ * that available [through coherency_available()] so the mbus controller
+ * doesn't enable the IO coherency bit in the attribute bits of the
+ * chip selects.
+ */
+static int coherency_enabled;
+
+int coherency_available(void)
+{
+ return coherency_enabled;
+}
+
int __init coherency_init(void)
{
struct device_node *np;
@@ -164,6 +178,7 @@ int __init coherency_init(void)
coherency_base = of_iomap(np, 0);
coherency_cpu_base = of_iomap(np, 1);
set_cpu_coherent(cpu_logical_map(smp_processor_id()), 0);
+ coherency_enabled = 1;
of_node_put(np);
}

--- a/arch/arm/mach-mvebu/coherency.h
+++ b/arch/arm/mach-mvebu/coherency.h
@@ -17,6 +17,7 @@
extern unsigned long coherency_phys_base;

int set_cpu_coherent(unsigned int cpu_id, int smp_group_id);
+int coherency_available(void);
int coherency_init(void);

#endif /* __MACH_370_XP_COHERENCY_H */
--- a/drivers/bus/mvebu-mbus.c
+++ b/drivers/bus/mvebu-mbus.c
@@ -701,7 +701,6 @@ static int __init mvebu_mbus_common_init
phys_addr_t sdramwins_phys_base,
size_t sdramwins_size)
{
- struct device_node *np;
int win;

mbus->mbuswins_base = ioremap(mbuswins_phys_base, mbuswins_size);
@@ -714,12 +713,6 @@ static int __init mvebu_mbus_common_init
return -ENOMEM;
}

- np = of_find_compatible_node(NULL, NULL, "marvell,coherency-fabric");
- if (np) {
- mbus->hw_io_coherency = 1;
- of_node_put(np);
- }
-
for (win = 0; win < mbus->soc->num_wins; win++)
mvebu_mbus_disable_window(mbus, win);

@@ -889,7 +882,7 @@ static void __init mvebu_mbus_get_pcie_r
}
}

-int __init mvebu_mbus_dt_init(void)
+int __init mvebu_mbus_dt_init(bool is_coherent)
{
struct resource mbuswins_res, sdramwins_res;
struct device_node *np, *controller;
@@ -928,6 +921,8 @@ int __init mvebu_mbus_dt_init(void)
return -EINVAL;
}

+ mbus_state.hw_io_coherency = is_coherent;
+
/* Get optional pcie-{mem,io}-aperture properties */
mvebu_mbus_get_pcie_resources(np, &mbus_state.pcie_mem_aperture,
&mbus_state.pcie_io_aperture);
--- a/include/linux/mbus.h
+++ b/include/linux/mbus.h
@@ -73,6 +73,6 @@ int mvebu_mbus_del_window(phys_addr_t ba
int mvebu_mbus_init(const char *soc, phys_addr_t mbus_phys_base,
size_t mbus_size, phys_addr_t sdram_phys_base,
size_t sdram_size);
-int mvebu_mbus_dt_init(void);
+int mvebu_mbus_dt_init(bool is_coherent);

#endif /* __LINUX_MBUS_H */

2015-07-01 18:48:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 18/34] ARM/arm64: KVM: fix use of WnR bit in kvm_is_write_fault()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ard Biesheuvel <[email protected]>

commit a7d079cea2dffb112e26da2566dd84c0ef1fce97 upstream.

[Since we don't backport commit 9804788 (arm/arm64: KVM: Support
KVM_CAP_READONLY_MEM), ingore the changes in kvm_handle_guest_abort
introduced by this patch.]

The ISS encoding for an exception from a Data Abort has a WnR
bit[6] that indicates whether the Data Abort was caused by a
read or a write instruction. While there are several fields
in the encoding that are only valid if the ISV bit[24] is set,
WnR is not one of them, so we can read it unconditionally.

Instead of fixing both implementations of kvm_is_write_fault()
in place, reimplement it just once using kvm_vcpu_dabt_iswrite(),
which already does the right thing with respect to the WnR bit.
Also fix up the callers to pass 'vcpu'

Acked-by: Laszlo Ersek <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/include/asm/kvm_mmu.h | 11 -----------
arch/arm/kvm/mmu.c | 10 +++++++++-
arch/arm64/include/asm/kvm_mmu.h | 13 -------------
3 files changed, 9 insertions(+), 25 deletions(-)

--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -78,17 +78,6 @@ static inline void kvm_set_pte(pte_t *pt
flush_pmd_entry(pte);
}

-static inline bool kvm_is_write_fault(unsigned long hsr)
-{
- unsigned long hsr_ec = hsr >> HSR_EC_SHIFT;
- if (hsr_ec == HSR_EC_IABT)
- return false;
- else if ((hsr & HSR_ISV) && !(hsr & HSR_WNR))
- return false;
- else
- return true;
-}
-
static inline void kvm_clean_pgd(pgd_t *pgd)
{
clean_dcache_area(pgd, PTRS_PER_S2_PGD * sizeof(pgd_t));
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -746,6 +746,14 @@ static bool transparent_hugepage_adjust(
return false;
}

+static bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
+{
+ if (kvm_vcpu_trap_is_iabt(vcpu))
+ return false;
+
+ return kvm_vcpu_dabt_iswrite(vcpu);
+}
+
static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_memory_slot *memslot,
unsigned long fault_status)
@@ -761,7 +769,7 @@ static int user_mem_abort(struct kvm_vcp
pfn_t pfn;
pgprot_t mem_type = PAGE_S2;

- write_fault = kvm_is_write_fault(kvm_vcpu_get_hsr(vcpu));
+ write_fault = kvm_is_write_fault(vcpu);
if (fault_status == FSC_PERM && !write_fault) {
kvm_err("Unexpected L2 read permission error\n");
return -EFAULT;
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -93,19 +93,6 @@ void kvm_clear_hyp_idmap(void);
#define kvm_set_pte(ptep, pte) set_pte(ptep, pte)
#define kvm_set_pmd(pmdp, pmd) set_pmd(pmdp, pmd)

-static inline bool kvm_is_write_fault(unsigned long esr)
-{
- unsigned long esr_ec = esr >> ESR_EL2_EC_SHIFT;
-
- if (esr_ec == ESR_EL2_EC_IABT)
- return false;
-
- if ((esr & ESR_EL2_ISV) && !(esr & ESR_EL2_WNR))
- return false;
-
- return true;
-}
-
static inline void kvm_clean_pgd(pgd_t *pgd) {}
static inline void kvm_clean_pmd_entry(pmd_t *pmd) {}
static inline void kvm_clean_pte(pte_t *pte) {}

2015-07-01 18:48:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 19/34] KVM: ARM: vgic: plug irq injection race

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <[email protected]>

commit 71afaba4a2e98bb7bdeba5078370ab43d46e67a1 upstream.

[Since we don't backport commit 227844f (arm/arm64: KVM: Rename irq_state
to irq_pending) for linux-3.14.y, here we still use vgic_update_irq_state
instead of vgic_update_irq_pending.]

As it stands, nothing prevents userspace from injecting an interrupt
before the guest's GIC is actually initialized.

This goes unnoticed so far (as everything is pretty much statically
allocated), but ends up exploding in a spectacular way once we switch
to a more dynamic allocation (the GIC data structure isn't there yet).

The fix is to test for the "ready" flag in the VGIC distributor before
trying to inject the interrupt. Note that in order to avoid breaking
userspace, we have to ignore what is essentially an error.

Signed-off-by: Marc Zyngier <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
virt/kvm/arm/vgic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1387,7 +1387,8 @@ out:
int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
bool level)
{
- if (vgic_update_irq_state(kvm, cpuid, irq_num, level))
+ if (likely(vgic_initialized(kvm)) &&
+ vgic_update_irq_state(kvm, cpuid, irq_num, level))
vgic_kick_vcpus(kvm);

return 0;

2015-07-01 18:48:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 20/34] arm/arm64: KVM: Fix set_clear_sgi_pend_reg offset

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 0fea6d7628ed6e25a9ee1b67edf7c859718d39e8 upstream.

The sgi values calculated in read_set_clear_sgi_pend_reg() and
write_set_clear_sgi_pend_reg() were horribly incorrectly multiplied by 4
with catastrophic results in that subfunctions ended up overwriting
memory not allocated for the expected purpose.

This showed up as bugs in kfree() and the kernel complaining a lot of
you turn on memory debugging.

This addresses: http://marc.info/?l=kvm&m=141164910007868&w=2

Reported-by: Shannon Zhao <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
virt/kvm/arm/vgic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -674,7 +674,7 @@ static bool read_set_clear_sgi_pend_reg(
{
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
int sgi;
- int min_sgi = (offset & ~0x3) * 4;
+ int min_sgi = (offset & ~0x3);
int max_sgi = min_sgi + 3;
int vcpu_id = vcpu->vcpu_id;
u32 reg = 0;
@@ -695,7 +695,7 @@ static bool write_set_clear_sgi_pend_reg
{
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
int sgi;
- int min_sgi = (offset & ~0x3) * 4;
+ int min_sgi = (offset & ~0x3);
int max_sgi = min_sgi + 3;
int vcpu_id = vcpu->vcpu_id;
u32 reg;

2015-07-01 18:48:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 21/34] arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Joel Schopp <[email protected]>

commit dbff124e29fa24aff9705b354b5f4648cd96e0bb upstream.

The current aarch64 calculation for VTTBR_BADDR_MASK masks only 39 bits
and not all the bits in the PA range. This is clearly a bug that
manifests itself on systems that allocate memory in the higher address
space range.

[ Modified from Joel's original patch to be based on PHYS_MASK_SHIFT
instead of a hard-coded value and to move the alignment check of the
allocation to mmu.c. Also added a comment explaining why we hardcode
the IPA range and changed the stage-2 pgd allocation to be based on
the 40 bit IPA range instead of the maximum possible 48 bit PA range.
- Christoffer ]

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Joel Schopp <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/arm.c | 4 ++--
arch/arm64/include/asm/kvm_arm.h | 13 ++++++++++++-
arch/arm64/include/asm/kvm_mmu.h | 5 ++---
3 files changed, 16 insertions(+), 6 deletions(-)

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -427,9 +427,9 @@ static void update_vttbr(struct kvm *kvm

/* update vttbr to be used with the new vmid */
pgd_phys = virt_to_phys(kvm->arch.pgd);
+ BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK;
- kvm->arch.vttbr = pgd_phys & VTTBR_BADDR_MASK;
- kvm->arch.vttbr |= vmid;
+ kvm->arch.vttbr = pgd_phys | vmid;

spin_unlock(&kvm_vmid_lock);
}
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -122,6 +122,17 @@
#define VTCR_EL2_T0SZ_MASK 0x3f
#define VTCR_EL2_T0SZ_40B 24

+/*
+ * We configure the Stage-2 page tables to always restrict the IPA space to be
+ * 40 bits wide (T0SZ = 24). Systems with a PARange smaller than 40 bits are
+ * not known to exist and will break with this configuration.
+ *
+ * Note that when using 4K pages, we concatenate two first level page tables
+ * together.
+ *
+ * The magic numbers used for VTTBR_X in this patch can be found in Tables
+ * D4-23 and D4-25 in ARM DDI 0487A.b.
+ */
#ifdef CONFIG_ARM64_64K_PAGES
/*
* Stage2 translation configuration:
@@ -151,7 +162,7 @@
#endif

#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
-#define VTTBR_BADDR_MASK (((1LLU << (40 - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
+#define VTTBR_BADDR_MASK (((1LLU << (PHYS_MASK_SHIFT - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
#define VTTBR_VMID_SHIFT (48LLU)
#define VTTBR_VMID_MASK (0xffLLU << VTTBR_VMID_SHIFT)

--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -59,10 +59,9 @@
#define KERN_TO_HYP(kva) ((unsigned long)kva - PAGE_OFFSET + HYP_PAGE_OFFSET)

/*
- * Align KVM with the kernel's view of physical memory. Should be
- * 40bit IPA, with PGD being 8kB aligned in the 4KB page configuration.
+ * We currently only support a 40bit IPA.
*/
-#define KVM_PHYS_SHIFT PHYS_MASK_SHIFT
+#define KVM_PHYS_SHIFT (40)
#define KVM_PHYS_SIZE (1UL << KVM_PHYS_SHIFT)
#define KVM_PHYS_MASK (KVM_PHYS_SIZE - 1UL)


2015-07-01 18:48:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 22/34] arm: kvm: fix CPU hotplug

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vladimir Murzin <[email protected]>

commit 37a34ac1d4775aafbc73b9db53c7daebbbc67e6a upstream.

On some platforms with no power management capabilities, the hotplug
implementation is allowed to return from a smp_ops.cpu_die() call as a
function return. Upon a CPU onlining event, the KVM CPU notifier tries
to reinstall the hyp stub, which fails on platform where no reset took
place following a hotplug event, with the message:

CPU1: smp_ops.cpu_die() returned, trying to resuscitate
CPU1: Booted secondary processor
Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x80409540
unexpected data abort in Hyp mode at: 0x80401fe8
unexpected HVC/SVC trap in Hyp mode at: 0x805c6170

since KVM code is trying to reinstall the stub on a system where it is
already configured.

To prevent this issue, this patch adds a check in the KVM hotplug
notifier that detects if the HYP stub really needs re-installing when a
CPU is onlined and skips the installation call if the stub is already in
place, which means that the CPU has not been reset.

Signed-off-by: Vladimir Murzin <[email protected]>
Acked-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/arm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -825,7 +825,8 @@ static int hyp_init_cpu_notify(struct no
switch (action) {
case CPU_STARTING:
case CPU_STARTING_FROZEN:
- cpu_init_hyp_mode(NULL);
+ if (__hyp_get_vectors() == hyp_default_vectors)
+ cpu_init_hyp_mode(NULL);
break;
}


2015-07-01 18:42:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 23/34] arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ard Biesheuvel <[email protected]>

commit 37b544087ef3f65ca68465ba39291a07195dac26 upstream.

Handle the potential NULL return value of find_vma_intersection()
before dereferencing it.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/mmu.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -778,6 +778,12 @@ static int user_mem_abort(struct kvm_vcp
/* Let's check if we will get back a huge page backed by hugetlbfs */
down_read(&current->mm->mmap_sem);
vma = find_vma_intersection(current->mm, hva, hva + 1);
+ if (unlikely(!vma)) {
+ kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+ up_read(&current->mm->mmap_sem);
+ return -EFAULT;
+ }
+
if (is_vm_hugetlb_page(vma)) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;

2015-07-01 18:42:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 24/34] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit c3058d5da2222629bc2223c488a4512b59bb4baf upstream.

[Since we don't backport commit 8eef912 (arm/arm64: KVM: map MMIO regions
at creation time) for linux-3.14.y, the context of this patch is
different, while the change itself is same.]

When creating or moving a memslot, make sure the IPA space is within the
addressable range of the guest. Otherwise, user space can create too
large a memslot and KVM would try to access potentially unallocated page
table entries when inserting entries in the Stage-2 page tables.

Acked-by: Catalin Marinas <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/kvm/mmu.c | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -926,6 +926,9 @@ int kvm_handle_guest_abort(struct kvm_vc

memslot = gfn_to_memslot(vcpu->kvm, gfn);

+ /* Userspace should not be able to register out-of-bounds IPAs */
+ VM_BUG_ON(fault_ipa >= KVM_PHYS_SIZE);
+
ret = user_mem_abort(vcpu, fault_ipa, memslot, fault_status);
if (ret == 0)
ret = 1;
@@ -1150,6 +1153,14 @@ int kvm_arch_prepare_memory_region(struc
struct kvm_userspace_memory_region *mem,
enum kvm_mr_change change)
{
+ /*
+ * Prevent userspace from creating a memory region outside of the IPA
+ * space addressable by the KVM guest IPA space.
+ */
+ if (memslot->base_gfn + memslot->npages >=
+ (KVM_PHYS_SIZE >> PAGE_SHIFT))
+ return -EFAULT;
+
return 0;
}


2015-07-01 18:42:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 25/34] arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steve Capper <[email protected]>

commit 3d08c629244257473450a8ba17cb8184b91e68f8 upstream.

Commit:
b886576 ARM: KVM: user_mem_abort: support stage 2 MMIO page mapping

introduced some code in user_mem_abort that failed to compile if
STRICT_MM_TYPECHECKS was enabled.

This patch fixes up the failing comparison.

Signed-off-by: Steve Capper <[email protected]>
Reviewed-by: Kim Phillips <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -850,7 +850,7 @@ static int user_mem_abort(struct kvm_vcp
}
coherent_cache_guest_page(vcpu, hva, PAGE_SIZE);
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
- mem_type == PAGE_S2_DEVICE);
+ pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
}



2015-07-01 18:44:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 26/34] arm64: KVM: fix unmapping with 48-bit VAs

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mark Rutland <[email protected]>

commit 7cbb87d67e38cfc55680290a706fd7517f10050d upstream.

Currently if using a 48-bit VA, tearing down the hyp page tables (which
can happen in the absence of a GICH or GICV resource) results in the
rather nasty splat below, evidently becasue we access a table that
doesn't actually exist.

Commit 38f791a4e499792e (arm64: KVM: Implement 48 VA support for KVM EL2
and Stage-2) added a pgd_none check to __create_hyp_mappings to account
for the additional level of tables, but didn't add a corresponding check
to unmap_range, and this seems to be the source of the problem.

This patch adds the missing pgd_none check, ensuring we don't try to
access tables that don't exist.

Original splat below:

kvm [1]: Using HYP init bounce page @83fe94a000
kvm [1]: Cannot obtain GICH resource
Unable to handle kernel paging request at virtual address ffff7f7fff000000
pgd = ffff800000770000
[ffff7f7fff000000] *pgd=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.18.0-rc2+ #89
task: ffff8003eb500000 ti: ffff8003eb45c000 task.ti: ffff8003eb45c000
PC is at unmap_range+0x120/0x580
LR is at free_hyp_pgds+0xac/0xe4
pc : [<ffff80000009b768>] lr : [<ffff80000009cad8>] pstate: 80000045
sp : ffff8003eb45fbf0
x29: ffff8003eb45fbf0 x28: ffff800000736000
x27: ffff800000735000 x26: ffff7f7fff000000
x25: 0000000040000000 x24: ffff8000006f5000
x23: 0000000000000000 x22: 0000007fffffffff
x21: 0000800000000000 x20: 0000008000000000
x19: 0000000000000000 x18: ffff800000648000
x17: ffff800000537228 x16: 0000000000000000
x15: 000000000000001f x14: 0000000000000000
x13: 0000000000000001 x12: 0000000000000020
x11: 0000000000000062 x10: 0000000000000006
x9 : 0000000000000000 x8 : 0000000000000063
x7 : 0000000000000018 x6 : 00000003ff000000
x5 : ffff800000744188 x4 : 0000000000000001
x3 : 0000000040000000 x2 : ffff800000000000
x1 : 0000007fffffffff x0 : 000000003fffffff

Process swapper/0 (pid: 1, stack limit = 0xffff8003eb45c058)
Stack: (0xffff8003eb45fbf0 to 0xffff8003eb460000)
fbe0: eb45fcb0 ffff8003 0009cad8 ffff8000
fc00: 00000000 00000080 00736140 ffff8000 00736000 ffff8000 00000000 00007c80
fc20: 00000000 00000080 006f5000 ffff8000 00000000 00000080 00743000 ffff8000
fc40: 00735000 ffff8000 006d3030 ffff8000 006fe7b8 ffff8000 00000000 00000080
fc60: ffffffff 0000007f fdac1000 ffff8003 fd94b000 ffff8003 fda47000 ffff8003
fc80: 00502b40 ffff8000 ff000000 ffff7f7f fdec6000 00008003 fdac1630 ffff8003
fca0: eb45fcb0 ffff8003 ffffffff 0000007f eb45fd00 ffff8003 0009b378 ffff8000
fcc0: ffffffea 00000000 006fe000 ffff8000 00736728 ffff8000 00736120 ffff8000
fce0: 00000040 00000000 00743000 ffff8000 006fe7b8 ffff8000 0050cd48 00000000
fd00: eb45fd60 ffff8003 00096070 ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd20: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00000000 00000000
fd40: 00000ae0 00000000 006aa25c ffff8000 eb45fd60 ffff8003 0017ca44 00000002
fd60: eb45fdc0 ffff8003 0009a33c ffff8000 006f06e0 ffff8000 006f06e0 ffff8000
fd80: fd948b40 ffff8003 0009a320 ffff8000 00000000 00000000 00735000 ffff8000
fda0: 006d3090 ffff8000 006aa25c ffff8000 00735000 ffff8000 006d3030 ffff8000
fdc0: eb45fdd0 ffff8003 000814c0 ffff8000 eb45fe50 ffff8003 006aaac4 ffff8000
fde0: 006ddd90 ffff8000 00000006 00000000 006d3000 ffff8000 00000095 00000000
fe00: 006a1e90 ffff8000 00735000 ffff8000 006d3000 ffff8000 006aa25c ffff8000
fe20: 00735000 ffff8000 006d3030 ffff8000 eb45fe50 ffff8003 006fac68 ffff8000
fe40: 00000006 00000006 fe293ee6 ffff8003 eb45feb0 ffff8003 004f8ee8 ffff8000
fe60: 004f8ed4 ffff8000 00735000 ffff8000 00000000 00000000 00000000 00000000
fe80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
fea0: 00000000 00000000 00000000 00000000 00000000 00000000 000843d0 ffff8000
fec0: 004f8ed4 ffff8000 00000000 00000000 00000000 00000000 00000000 00000000
fee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff00: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000005 00000000
ffe0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call trace:
[<ffff80000009b768>] unmap_range+0x120/0x580
[<ffff80000009cad4>] free_hyp_pgds+0xa8/0xe4
[<ffff80000009b374>] kvm_arch_init+0x268/0x44c
[<ffff80000009606c>] kvm_init+0x24/0x260
[<ffff80000009a338>] arm_init+0x18/0x24
[<ffff8000000814bc>] do_one_initcall+0x88/0x1a0
[<ffff8000006aaac0>] kernel_init_freeable+0x148/0x1e8
[<ffff8000004f8ee4>] kernel_init+0x10/0xd4
Code: 8b000263 92628479 d1000720 eb01001f (f9400340)
---[ end trace 3bc230562e926fa4 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jungseok Lee <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Acked-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/mmu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -197,7 +197,8 @@ static void unmap_range(struct kvm *kvm,
pgd = pgdp + pgd_index(addr);
do {
next = kvm_pgd_addr_end(addr, end);
- unmap_puds(kvm, pgd, addr, next);
+ if (!pgd_none(*pgd))
+ unmap_puds(kvm, pgd, addr, next);
} while (pgd++, addr = next, addr != end);
}


2015-07-01 18:44:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 27/34] arm/arm64: KVM: vgic: Fix error code in kvm_vgic_create()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 6b50f54064a02b77a7b990032b80234fee59bcd6 upstream.

If we detect another vCPU is running we just exit and return 0 as if we
succesfully created the VGIC, but the VGIC wouldn't actual be created.

This shouldn't break in-kernel behavior because the kernel will not
observe the failed the attempt to create the VGIC, but userspace could
be rightfully confused.

Cc: Andre Przywara <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
virt/kvm/arm/vgic.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1611,7 +1611,7 @@ out:

int kvm_vgic_create(struct kvm *kvm)
{
- int i, vcpu_lock_idx = -1, ret = 0;
+ int i, vcpu_lock_idx = -1, ret;
struct kvm_vcpu *vcpu;

mutex_lock(&kvm->lock);
@@ -1626,6 +1626,7 @@ int kvm_vgic_create(struct kvm *kvm)
* vcpu->mutex. By grabbing the vcpu->mutex of all VCPUs we ensure
* that no other VCPUs are run while we create the vgic.
*/
+ ret = -EBUSY;
kvm_for_each_vcpu(i, vcpu, kvm) {
if (!mutex_trylock(&vcpu->mutex))
goto out_unlock;
@@ -1633,11 +1634,10 @@ int kvm_vgic_create(struct kvm *kvm)
}

kvm_for_each_vcpu(i, vcpu, kvm) {
- if (vcpu->arch.has_run_once) {
- ret = -EBUSY;
+ if (vcpu->arch.has_run_once)
goto out_unlock;
- }
}
+ ret = 0;

spin_lock_init(&kvm->arch.vgic.lock);
kvm->arch.vgic.vctrl_base = vgic_vctrl_base;

2015-07-01 18:43:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 28/34] arm64/kvm: Fix assembler compatibility of macros

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Geoff Levand <[email protected]>

commit 286fb1cc32b11c18da3573a8c8c37a4f9da16e30 upstream.

Some of the macros defined in kvm_arm.h are useful in assembly files, but are
not compatible with the assembler. Change any C language integer constant
definitions using appended U, UL, or ULL to the UL() preprocessor macro. Also,
add a preprocessor include of the asm/memory.h file which defines the UL()
macro.

Fixes build errors like these when using kvm_arm.h in assembly
source files:

Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)'

Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Geoff Levand <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/include/asm/kvm_arm.h | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)

--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -18,6 +18,7 @@
#ifndef __ARM64_KVM_ARM_H__
#define __ARM64_KVM_ARM_H__

+#include <asm/memory.h>
#include <asm/types.h>

/* Hyp Configuration Register (HCR) bits */
@@ -162,9 +163,9 @@
#endif

#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
-#define VTTBR_BADDR_MASK (((1LLU << (PHYS_MASK_SHIFT - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
-#define VTTBR_VMID_SHIFT (48LLU)
-#define VTTBR_VMID_MASK (0xffLLU << VTTBR_VMID_SHIFT)
+#define VTTBR_BADDR_MASK (((UL(1) << (PHYS_MASK_SHIFT - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
+#define VTTBR_VMID_SHIFT (UL(48))
+#define VTTBR_VMID_MASK (UL(0xFF) << VTTBR_VMID_SHIFT)

/* Hyp System Trap Register */
#define HSTR_EL2_TTEE (1 << 16)
@@ -187,13 +188,13 @@

/* Exception Syndrome Register (ESR) bits */
#define ESR_EL2_EC_SHIFT (26)
-#define ESR_EL2_EC (0x3fU << ESR_EL2_EC_SHIFT)
-#define ESR_EL2_IL (1U << 25)
+#define ESR_EL2_EC (UL(0x3f) << ESR_EL2_EC_SHIFT)
+#define ESR_EL2_IL (UL(1) << 25)
#define ESR_EL2_ISS (ESR_EL2_IL - 1)
#define ESR_EL2_ISV_SHIFT (24)
-#define ESR_EL2_ISV (1U << ESR_EL2_ISV_SHIFT)
+#define ESR_EL2_ISV (UL(1) << ESR_EL2_ISV_SHIFT)
#define ESR_EL2_SAS_SHIFT (22)
-#define ESR_EL2_SAS (3U << ESR_EL2_SAS_SHIFT)
+#define ESR_EL2_SAS (UL(3) << ESR_EL2_SAS_SHIFT)
#define ESR_EL2_SSE (1 << 21)
#define ESR_EL2_SRT_SHIFT (16)
#define ESR_EL2_SRT_MASK (0x1f << ESR_EL2_SRT_SHIFT)
@@ -207,16 +208,16 @@
#define ESR_EL2_FSC_TYPE (0x3c)

#define ESR_EL2_CV_SHIFT (24)
-#define ESR_EL2_CV (1U << ESR_EL2_CV_SHIFT)
+#define ESR_EL2_CV (UL(1) << ESR_EL2_CV_SHIFT)
#define ESR_EL2_COND_SHIFT (20)
-#define ESR_EL2_COND (0xfU << ESR_EL2_COND_SHIFT)
+#define ESR_EL2_COND (UL(0xf) << ESR_EL2_COND_SHIFT)


#define FSC_FAULT (0x04)
#define FSC_PERM (0x0c)

/* Hyp Prefetch Fault Address Register (HPFAR/HDFAR) */
-#define HPFAR_MASK (~0xFUL)
+#define HPFAR_MASK (~UL(0xf))

#define ESR_EL2_EC_UNKNOWN (0x00)
#define ESR_EL2_EC_WFI (0x01)

2015-07-01 18:43:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 29/34] arm/arm64: kvm: drop inappropriate use of kvm_is_mmio_pfn()

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ard Biesheuvel <[email protected]>

commit 07a9748c78cfc39b54f06125a216b67b9c8f09ed upstream.

Instead of using kvm_is_mmio_pfn() to decide whether a host region
should be stage 2 mapped with device attributes, add a new static
function kvm_is_device_pfn() that disregards RAM pages with the
reserved bit set, as those should usually not be mapped as device
memory.

Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/mmu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -755,6 +755,11 @@ static bool kvm_is_write_fault(struct kv
return kvm_vcpu_dabt_iswrite(vcpu);
}

+static bool kvm_is_device_pfn(unsigned long pfn)
+{
+ return !pfn_valid(pfn);
+}
+
static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct kvm_memory_slot *memslot,
unsigned long fault_status)
@@ -825,7 +830,7 @@ static int user_mem_abort(struct kvm_vcp
if (is_error_pfn(pfn))
return -EFAULT;

- if (kvm_is_mmio_pfn(pfn))
+ if (kvm_is_device_pfn(pfn))
mem_type = PAGE_S2_DEVICE;

spin_lock(&kvm->mmu_lock);

2015-07-01 18:43:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 30/34] arm/arm64: KVM: Dont clear the VCPU_POWER_OFF flag

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 03f1d4c17edb31b41b14ca3a749ae38d2dd6639d upstream.

If a VCPU was originally started with power off (typically to be brought
up by PSCI in SMP configurations), there is no need to clear the
POWER_OFF flag in the kernel, as this flag is only tested during the
init ioctl itself.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/arm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -678,7 +678,7 @@ static int kvm_arch_vcpu_ioctl_vcpu_init
/*
* Handle the "start in power-off" case by marking the VCPU as paused.
*/
- if (__test_and_clear_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
+ if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
vcpu->arch.pause = true;

return 0;

2015-07-01 18:43:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 31/34] arm/arm64: KVM: Correct KVM_ARM_VCPU_INIT power off option

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 3ad8b3de526a76fbe9466b366059e4958957b88f upstream.

The implementation of KVM_ARM_VCPU_INIT is currently not doing what
userspace expects, namely making sure that a vcpu which may have been
turned off using PSCI is returned to its initial state, which would be
powered on if userspace does not set the KVM_ARM_VCPU_POWER_OFF flag.

Implement the expected functionality and clarify the ABI.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/virtual/kvm/api.txt | 3 ++-
arch/arm/kvm/arm.c | 2 ++
2 files changed, 4 insertions(+), 1 deletion(-)

--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2344,7 +2344,8 @@ should be created before this ioctl is i

Possible features:
- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
- Depends on KVM_CAP_ARM_PSCI.
+ Depends on KVM_CAP_ARM_PSCI. If not set, the CPU will be powered on
+ and execute guest code when KVM_RUN is called.
- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -680,6 +680,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init
*/
if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
vcpu->arch.pause = true;
+ else
+ vcpu->arch.pause = false;

return 0;
}

2015-07-01 18:43:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 32/34] arm/arm64: KVM: Reset the HCR on each vcpu when resetting the vcpu

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit b856a59141b1066d3c896a0d0231f84dabd040af upstream.

When userspace resets the vcpu using KVM_ARM_VCPU_INIT, we should also
reset the HCR, because we now modify the HCR dynamically to
enable/disable trapping of guest accesses to the VM registers.

This is crucial for reboot of VMs working since otherwise we will not be
doing the necessary cache maintenance operations when faulting in pages
with the guest MMU off.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/include/asm/kvm_emulate.h | 5 +++++
arch/arm/kvm/arm.c | 2 ++
arch/arm/kvm/guest.c | 1 -
arch/arm64/include/asm/kvm_emulate.h | 5 +++++
arch/arm64/kvm/guest.c | 1 -
5 files changed, 12 insertions(+), 2 deletions(-)

--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -33,6 +33,11 @@ void kvm_inject_undefined(struct kvm_vcp
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);

+static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.hcr = HCR_GUEST_MASK;
+}
+
static inline bool vcpu_mode_is_32bit(struct kvm_vcpu *vcpu)
{
return 1;
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -675,6 +675,8 @@ static int kvm_arch_vcpu_ioctl_vcpu_init
if (ret)
return ret;

+ vcpu_reset_hcr(vcpu);
+
/*
* Handle the "start in power-off" case by marking the VCPU as paused.
*/
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_en

int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
- vcpu->arch.hcr = HCR_GUEST_MASK;
return 0;
}

--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -38,6 +38,11 @@ void kvm_inject_undefined(struct kvm_vcp
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);

+static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
+{
+ vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
+}
+
static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
{
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -38,7 +38,6 @@ struct kvm_stats_debugfs_item debugfs_en

int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
{
- vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
return 0;
}


2015-07-01 18:43:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 33/34] arm/arm64: KVM: Introduce stage2_unmap_vm

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 957db105c99792ae8ef61ffc9ae77d910f6471da upstream.

Introduce a new function to unmap user RAM regions in the stage2 page
tables. This is needed on reboot (or when the guest turns off the MMU)
to ensure we fault in pages again and make the dcache, RAM, and icache
coherent.

Using unmap_stage2_range for the whole guest physical range does not
work, because that unmaps IO regions (such as the GIC) which will not be
recreated or in the best case faulted in on a page-by-page basis.

Call this function on secondary and subsequent calls to the
KVM_ARM_VCPU_INIT ioctl so that a reset VCPU will detect the guest
Stage-1 MMU is off when faulting in pages and make the caches coherent.

Acked-by: Marc Zyngier <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/include/asm/kvm_mmu.h | 1
arch/arm/kvm/arm.c | 7 ++++
arch/arm/kvm/mmu.c | 65 +++++++++++++++++++++++++++++++++++++++
arch/arm64/include/asm/kvm_mmu.h | 1
4 files changed, 74 insertions(+)

--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -47,6 +47,7 @@ int create_hyp_io_mappings(void *from, v
void free_boot_hyp_pgd(void);
void free_hyp_pgds(void);

+void stage2_unmap_vm(struct kvm *kvm);
int kvm_alloc_stage2_pgd(struct kvm *kvm);
void kvm_free_stage2_pgd(struct kvm *kvm);
int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -675,6 +675,13 @@ static int kvm_arch_vcpu_ioctl_vcpu_init
if (ret)
return ret;

+ /*
+ * Ensure a rebooted VM will fault in RAM pages and detect if the
+ * guest MMU is turned off and flush the caches as needed.
+ */
+ if (vcpu->arch.has_run_once)
+ stage2_unmap_vm(vcpu->kvm);
+
vcpu_reset_hcr(vcpu);

/*
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -556,6 +556,71 @@ static void unmap_stage2_range(struct kv
unmap_range(kvm, kvm->arch.pgd, start, size);
}

+static void stage2_unmap_memslot(struct kvm *kvm,
+ struct kvm_memory_slot *memslot)
+{
+ hva_t hva = memslot->userspace_addr;
+ phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
+ phys_addr_t size = PAGE_SIZE * memslot->npages;
+ hva_t reg_end = hva + size;
+
+ /*
+ * A memory region could potentially cover multiple VMAs, and any holes
+ * between them, so iterate over all of them to find out if we should
+ * unmap any of them.
+ *
+ * +--------------------------------------------+
+ * +---------------+----------------+ +----------------+
+ * | : VMA 1 | VMA 2 | | VMA 3 : |
+ * +---------------+----------------+ +----------------+
+ * | memory region |
+ * +--------------------------------------------+
+ */
+ do {
+ struct vm_area_struct *vma = find_vma(current->mm, hva);
+ hva_t vm_start, vm_end;
+
+ if (!vma || vma->vm_start >= reg_end)
+ break;
+
+ /*
+ * Take the intersection of this VMA with the memory region
+ */
+ vm_start = max(hva, vma->vm_start);
+ vm_end = min(reg_end, vma->vm_end);
+
+ if (!(vma->vm_flags & VM_PFNMAP)) {
+ gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
+ unmap_stage2_range(kvm, gpa, vm_end - vm_start);
+ }
+ hva = vm_end;
+ } while (hva < reg_end);
+}
+
+/**
+ * stage2_unmap_vm - Unmap Stage-2 RAM mappings
+ * @kvm: The struct kvm pointer
+ *
+ * Go through the memregions and unmap any reguler RAM
+ * backing memory already mapped to the VM.
+ */
+void stage2_unmap_vm(struct kvm *kvm)
+{
+ struct kvm_memslots *slots;
+ struct kvm_memory_slot *memslot;
+ int idx;
+
+ idx = srcu_read_lock(&kvm->srcu);
+ spin_lock(&kvm->mmu_lock);
+
+ slots = kvm_memslots(kvm);
+ kvm_for_each_memslot(memslot, slots)
+ stage2_unmap_memslot(kvm, memslot);
+
+ spin_unlock(&kvm->mmu_lock);
+ srcu_read_unlock(&kvm->srcu, idx);
+}
+
/**
* kvm_free_stage2_pgd - free all stage-2 tables
* @kvm: The KVM struct pointer for the VM.
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -74,6 +74,7 @@ int create_hyp_io_mappings(void *from, v
void free_boot_hyp_pgd(void);
void free_hyp_pgds(void);

+void stage2_unmap_vm(struct kvm *kvm);
int kvm_alloc_stage2_pgd(struct kvm *kvm);
void kvm_free_stage2_pgd(struct kvm *kvm);
int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,

2015-07-01 18:43:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.14 34/34] arm/arm64: KVM: Dont allow creating VCPUs after vgic_initialized

3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoffer Dall <[email protected]>

commit 716139df2517fbc3f2306dbe8eba0fa88dca0189 upstream.

When the vgic initializes its internal state it does so based on the
number of VCPUs available at the time. If we allow KVM to create more
VCPUs after the VGIC has been initialized, we are likely to error out in
unfortunate ways later, perform buffer overflows etc.

Acked-by: Marc Zyngier <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Shannon Zhao <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/arm.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -220,6 +220,11 @@ struct kvm_vcpu *kvm_arch_vcpu_create(st
int err;
struct kvm_vcpu *vcpu;

+ if (irqchip_in_kernel(kvm) && vgic_initialized(kvm)) {
+ err = -EBUSY;
+ goto out;
+ }
+
vcpu = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
if (!vcpu) {
err = -ENOMEM;

2015-07-01 22:36:06

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 3.14 00/34] 3.14.47-stable review

On 07/01/2015 12:40 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.47 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jul 3 18:39:45 UTC 2015.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.14.47-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


--
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
[email protected] | (970) 217-8978

2015-07-02 02:20:08

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.14 00/34] 3.14.47-stable review

On 07/01/2015 11:40 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.47 release.
> There are 34 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jul 3 18:39:45 UTC 2015.
> Anything received after that time might be too late.
>

Build results:
total: 120 pass: 120 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter

2015-07-02 04:30:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.14 00/34] 3.14.47-stable review

On Wed, Jul 01, 2015 at 07:19:54PM -0700, Guenter Roeck wrote:
> On 07/01/2015 11:40 AM, Greg Kroah-Hartman wrote:
> >This is the start of the stable review cycle for the 3.14.47 release.
> >There are 34 patches in this series, all will be posted as a response
> >to this one. If anyone has any issues with these being applied, please
> >let me know.
> >
> >Responses should be made by Fri Jul 3 18:39:45 UTC 2015.
> >Anything received after that time might be too late.
> >
>
> Build results:
> total: 120 pass: 120 fail: 0
> Qemu test results:
> total: 30 pass: 30 fail: 0
>
> Details are available at http://server.roeck-us.net:8010/builders.

Great, thanks for running these tests and letting me know.

greg k-h