2016-03-16 08:06:24

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 000/107] 3.4.111-rc1 review

From: Zefan Li <[email protected]>

This is the start of the stable review cycle for the 3.4.111 release.
There are 107 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Mar 16 16:04:10 CST 2016.
Anything received after that time might be too late.

A combined patch relative to 3.4.110 will be posted as an additional
response to this. A shortlog and diffstat can be found below.

thanks,

Zefan Li

--------------------

AMAN DEEP (1):
usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init()
function

Al Viro (3):
9p: don't leave a half-initialized inode sitting around
sg_start_req(): make sure that there's not too many elements in iovec
get rid of s_files and files_lock

Aleksei Mamlin (1):
libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk
VB0250EAVER

Alex Deucher (1):
drm/radeon/combios: add some validation of lvds values

Alexei Potashnik (1):
target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT

Andrey Vagin (1):
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Andy Lutomirski (2):
x86/xen: Probe target addresses in set_aliased_prot() before the
hypercall
x86/ldt: Make modify_ldt synchronous

Anssi Hannula (1):
ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest
DragonFly

Arne Fitzenreiter (2):
libata: add ATA_HORKAGE_NOTRIM
libata: force disable trim for SuperSSpeed S238

Bart Van Assche (1):
libfc: Fix fc_fcp_cleanup_each_cmd()

Ben Hutchings (2):
isdn_ppp: Add checks for allocation failure in isdn_ppp_open()
ppp, slip: Validate VJ compression slot parameters completely

Ben Zhang (1):
kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not
every one

Bernhard Bender (1):
Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen

Brian Campbell (1):
xhci: Calculate old endpoints correctly on device reset

Chris Metcalf (1):
tile: use free_bootmem_late() for initrd

Claudio Cappelli (1):
USB: option: add 2020:4000 ID

Clemens Ladisch (2):
ALSA: tlv: compute TLV_*_ITEM lengths automatically
ALSA: tlv: add DECLARE_TLV_DB_RANGE()

Dan Carpenter (1):
rds: fix an integer overflow test in rds_info_getsockopt()

Daniel Borkmann (1):
rtnetlink: verify IFLA_VF_INFO attributes before passing them to
driver

David Ahern (1):
net: Fix RCU splat in af_key

David Daney (1):
MIPS: Make set_pte() SMP safe.

David Howells (2):
KEYS: Fix race between key destruction and finding a keyring by name
KEYS: Fix crash when attempt to garbage collect an uninstantiated
keyring

Dennis Yang (1):
dm btree remove: fix bug in redistribute3

Dirk Behme (1):
USB: sierra: add 1199:68AB device ID

Dominic Sacré (1):
ALSA: usb-audio: Add MIDI support for Steinberg MI2/MI4

Edward Hyunkoo Jee (1):
inet: frags: fix defragmented packet's IP header for af_packet

Eric Northup (1):
KVM: x86: work around infinite loop in microcode when #AC is delivered

Felix Fietkau (1):
MIPS: Fix sched_getaffinity with MT FPAFF enabled

Filipe Manana (1):
Btrfs: use kmem_cache_free when freeing entry in inode cache

Gioh Kim (1):
fs/buffer.c: support buffer cache allocations with gfp modifiers

Hannes Frederic Sowa (3):
net: add validation for the socket syscall protocol argument
ipv6: probe routes asynchronous in rt6_probe
net: fix warnings in 'make htmldocs' by moving macro definition out of
field declaration

Heiko Carstens (1):
s390/process: fix sfpc inline assembly

Herbert Xu (2):
net: Clone skb before setting peeked flag
crypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer

Herton R. Krzesinski (1):
ipc,sem: fix use after free on IPC_RMID after a task using same
semaphore set exits

Jan Beulich (1):
x86/LDT: Print the real LDT base address

Jason Wang (1):
virtio-net: drop NETIF_F_FRAGLIST

Jiri Pirko (1):
niu: don't count tx error twice in case of headroom realloc fails

Joe Perches (1):
hpfs: hpfs_error: Remove static buffer, use vsprintf extension %pV
instead

Joe Stringer (1):
netfilter: nf_conntrack: Support expectations in different zones

Joe Thornber (4):
dm thin: allocate the cell_sort_array dynamically
dm btree: silence lockdep lock inversion in dm_btree_del()
dm btree: add ref counting ops for the leaves of top level btrees
dm btree remove: fix a bug when rebalancing nodes after removal

Johan Hovold (1):
USB: whiteheat: fix potential null-deref at probe

John Soni Jose (1):
libiscsi: Fix host busy blocking during connection teardown

John Youn (2):
usb: dwc3: Reset the transfer resource index on SET_INTERFACE
usb: dwc3: Fix assignment of EP transfer resources

Joseph Qi (1):
ocfs2: fix BUG in ocfs2_downconvert_thread_do_work()

Juergen Gross (2):
x86/ldt: Correct LDT access in single stepping logic
x86/ldt: Correct FPU emulation access to LDT

Julian Anastasov (2):
net: do not process device backlog during unregistration
net: call rcu_read_lock early in process_backlog

Kirill A. Shutemov (1):
mm: avoid setting up anonymous pages into file mapping

Linus Torvalds (1):
Initialize msg/shm IPC objects before doing ipc_addid()

Lior Amsalem (1):
ata: pmp: add quirk for Marvell 4140 SATA PMP

Marc-André Lureau (1):
vhost: actually track log eventfd file

Marcelo Leitner (1):
ipv6: addrconf: validate new MTU before applying it

Marcelo Tosatti (1):
KVM: x86: move steal time initialization to vcpu entry time

Martin Schwidefsky (1):
s390/sclp: clear upper register halves in _sclp_print_early

Mathias Nyman (1):
xhci: fix off by one error in TRB DMA address boundary check

Michael Walle (1):
EDAC, ppc4xx: Access mci->csrows array elements properly

Michal Hocko (1):
ext4: replace open coded nofail allocation in ext4_free_blocks()

Michal Kubeček (1):
ipv6: prevent fib6_run_gc() contention

Mikulas Patocka (1):
libata: increase the timeout when setting transfer mode

Neil Brown (1):
SUNRPC: never enqueue a ->rq_cong request on ->sending

NeilBrown (4):
md: make sure everything is freed when dm-raid stops an array.
md: flush ->event_work before stopping array.
md/raid1: fix test for 'was read error from last working device'.
md/raid1: extend spinlock to protect raid1_end_read_request against
inconsistencies

Nicholas Bellinger (1):
iscsi-target: Fix use-after-free during TPG session shutdown

Nikolay Borisov (2):
bufferhead: Add _gfp version for sb_getblk()
ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp

Oliver Neukum (1):
usb-storage: ignore ZTE MF 823 card reader in mode 0x1225

Paolo Bonzini (1):
KVM: svm: unconditionally intercept #DB

Peter Sanford (1):
USB: cp210x: add ID for Aruba Networks controllers

Peter Zijlstra (1):
perf: Fix fasync handling on inherited events

Quentin Casasnovas (1):
RDS: fix race condition when sending a message on unbound socket

Rainer Weikusat (2):
unix: avoid use-after-free in ep_remove_wait_queue
af_unix: Guard against other == sk in unix_dgram_sendmsg

Richard Stearn (1):
NET: AX.25: Stop heartbeat timer on disconnect.

Richard Weinberger (1):
localmodconfig: Use Kbuild files too

Sanidhya Kashyap (1):
hpfs: kstrdup() out of memory handling

Sasha Levin (2):
RDS: verify the underlying transport exists before creating a
connection
atm: deal with setting entry before mkip was called

Seymour, Shane M (1):
st: null pointer dereference panic caused by use after kref_put by
st_open

Stefan Agner (1):
can: mcp251x: fix resume when device is down

Tom Hughes (1):
mac80211: clear subdir_stations when removing debugfs

Tomas Winkler (1):
mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()

WANG Cong (1):
pptp: verify sockaddr_len in pptp_bind() and pptp_connect()

Wengang Wang (1):
rds: rds_ib_device.refcount overflow

Yao-Wen Mao (1):
ALSA: usb-audio: add dB range mapping for some devices

Zefan Li (1):
Revert "usb: dwc3: Reset the transfer resource index on SET_INTERFACE"

Zhao Junwang (1):
drm: add a check for x/y in drm_mode_setcrtc

Zhuang Jin Can (2):
xhci: report U3 when link is in resume state
xhci: prevent bus_suspend if SS port resuming in phase 1

[email protected] (1):
net: avoid to hang up on sending due to sysctl configuration overflow.

lucien (1):
sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state

arch/mips/include/asm/pgtable.h | 31 +++
arch/mips/kernel/mips-mt-fpaff.c | 5 +-
arch/s390/kernel/process.c | 2 +-
arch/s390/kernel/sclp.S | 4 +
arch/tile/kernel/setup.c | 2 +-
arch/x86/include/asm/desc.h | 15 --
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/asm/mmu.h | 3 +-
arch/x86/include/asm/mmu_context.h | 49 ++++-
arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/ldt.c | 267 ++++++++++++++-----------
arch/x86/kernel/process_64.c | 6 +-
arch/x86/kernel/step.c | 8 +-
arch/x86/kvm/svm.c | 22 +-
arch/x86/kvm/trace.h | 1 +
arch/x86/kvm/vmx.c | 5 +-
arch/x86/kvm/x86.c | 9 +-
arch/x86/math-emu/fpu_entry.c | 3 +-
arch/x86/math-emu/fpu_system.h | 21 +-
arch/x86/math-emu/get_address.c | 3 +-
arch/x86/power/cpu.c | 3 +-
arch/x86/xen/enlighten.c | 40 ++++
drivers/ata/libata-core.c | 9 +-
drivers/ata/libata-pmp.c | 7 +
drivers/ata/libata-scsi.c | 3 +-
drivers/crypto/ixp4xx_crypto.c | 1 -
drivers/edac/ppc4xx_edac.c | 2 +-
drivers/gpu/drm/drm_crtc.c | 7 +-
drivers/gpu/drm/radeon/radeon_combios.c | 7 +-
drivers/input/touchscreen/usbtouchscreen.c | 3 +
drivers/isdn/i4l/isdn_ppp.c | 12 +-
drivers/md/dm-thin.c | 14 +-
drivers/md/md.c | 18 +-
drivers/md/persistent-data/dm-btree-internal.h | 6 +
drivers/md/persistent-data/dm-btree-remove.c | 35 ++--
drivers/md/persistent-data/dm-btree-spine.c | 37 ++++
drivers/md/persistent-data/dm-btree.c | 9 +-
drivers/md/raid1.c | 9 +-
drivers/mmc/card/block.c | 2 +
drivers/net/can/mcp251x.c | 15 +-
drivers/net/ethernet/sun/niu.c | 4 +-
drivers/net/ppp/ppp_generic.c | 6 +-
drivers/net/ppp/pptp.c | 6 +
drivers/net/slip/slhc.c | 12 +-
drivers/net/slip/slip.c | 2 +-
drivers/net/virtio_net.c | 4 +-
drivers/scsi/libfc/fc_fcp.c | 19 +-
drivers/scsi/libiscsi.c | 25 +--
drivers/scsi/sg.c | 3 +
drivers/scsi/st.c | 2 +-
drivers/target/iscsi/iscsi_target.c | 14 +-
drivers/usb/dwc3/core.h | 1 -
drivers/usb/dwc3/ep0.c | 5 -
drivers/usb/dwc3/gadget.c | 70 +++++--
drivers/usb/host/xhci-hub.c | 13 +-
drivers/usb/host/xhci-mem.c | 2 +-
drivers/usb/host/xhci-ring.c | 5 +-
drivers/usb/host/xhci.c | 3 +
drivers/usb/host/xhci.h | 1 +
drivers/usb/serial/cp210x.c | 1 +
drivers/usb/serial/option.c | 1 +
drivers/usb/serial/sierra.c | 1 +
drivers/usb/serial/whiteheat.c | 31 +++
drivers/usb/storage/unusual_devs.h | 12 ++
drivers/vhost/vhost.c | 1 +
fs/9p/vfs_inode.c | 3 +-
fs/9p/vfs_inode_dotl.c | 3 +-
fs/btrfs/inode-map.c | 2 +-
fs/buffer.c | 43 ++--
fs/ext4/extents.c | 7 +-
fs/ext4/mballoc.c | 16 +-
fs/file_table.c | 130 ------------
fs/hpfs/super.c | 19 +-
fs/internal.h | 3 -
fs/ocfs2/dlmglue.c | 10 +-
fs/open.c | 2 -
fs/super.c | 21 +-
include/linux/buffer_head.h | 54 ++++-
include/linux/fs.h | 13 --
include/linux/libata.h | 2 +
include/net/af_unix.h | 1 +
include/net/ip6_fib.h | 2 +-
include/net/sock.h | 1 +
include/sound/tlv.h | 24 ++-
ipc/msg.c | 18 +-
ipc/sem.c | 23 ++-
ipc/shm.c | 13 +-
ipc/util.c | 8 +-
kernel/events/core.c | 12 +-
kernel/watchdog.c | 8 +
mm/memory.c | 17 +-
net/atm/clip.c | 3 +
net/ax25/af_ax25.c | 3 +
net/ax25/ax25_subr.c | 1 +
net/core/datagram.c | 41 +++-
net/core/dev.c | 26 +--
net/core/rtnetlink.c | 106 +++++-----
net/core/sysctl_net_core.c | 14 +-
net/decnet/af_decnet.c | 3 +
net/ipv4/af_inet.c | 3 +
net/ipv4/ip_fragment.c | 6 +-
net/ipv4/sysctl_net_ipv4.c | 11 +-
net/ipv6/addrconf.c | 17 +-
net/ipv6/af_inet6.c | 3 +
net/ipv6/ip6_fib.c | 19 +-
net/ipv6/ndisc.c | 4 +-
net/ipv6/route.c | 41 +++-
net/irda/af_irda.c | 3 +
net/key/af_key.c | 46 ++---
net/mac80211/debugfs_netdev.c | 1 +
net/netfilter/nf_conntrack_core.c | 21 +-
net/netfilter/nf_conntrack_expect.c | 3 +-
net/rds/ib_rdma.c | 4 +-
net/rds/info.c | 2 +-
net/rds/send.c | 4 +-
net/sctp/sm_sideeffect.c | 2 +-
net/sunrpc/xprt.c | 3 +
net/unix/af_unix.c | 188 +++++++++++++++--
scripts/kconfig/streamline_config.pl | 2 +-
security/keys/gc.c | 10 +-
sound/usb/mixer.c | 2 +
sound/usb/mixer_maps.c | 12 ++
sound/usb/mixer_quirks.c | 37 ++++
sound/usb/mixer_quirks.h | 4 +
sound/usb/quirks-table.h | 68 +++++++
125 files changed, 1390 insertions(+), 722 deletions(-)

--
1.9.1


2016-03-16 08:06:57

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 001/107] Btrfs: use kmem_cache_free when freeing entry in inode cache

From: Filipe Manana <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c3f4a1685bb87e59c886ee68f7967eae07d4dffa upstream.

The free space entries are allocated using kmem_cache_zalloc(),
through __btrfs_add_free_space(), therefore we should use
kmem_cache_free() and not kfree() to avoid any confusion and
any potential problem. Looking at the kfree() definition at
mm/slab.c it has the following comment:

/*
* (...)
*
* Don't free memory not originally allocated by kmalloc()
* or you will run into trouble.
*/

So better be safe and use kmem_cache_free().

Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
fs/btrfs/inode-map.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index b1a1c92..47abfd2 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -283,7 +283,7 @@ void btrfs_unpin_free_ino(struct btrfs_root *root)
__btrfs_add_free_space(ctl, info->offset, count);
free:
rb_erase(&info->offset_index, rbroot);
- kfree(info);
+ kmem_cache_free(btrfs_free_space_cachep, info);
}
}

--
1.9.1

2016-03-16 08:07:05

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 003/107] bufferhead: Add _gfp version for sb_getblk()

From: Nikolay Borisov <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit bd7ade3cd9b0850264306f5c2b79024a417b6396 upstream.

sb_getblk() is used during ext4 (and possibly other FSes) writeback
paths. Sometimes such path require allocating memory and guaranteeing
that such allocation won't block. Currently, however, there is no way
to provide user flags for sb_getblk which could lead to deadlocks.

This patch implements a sb_getblk_gfp with the only difference it can
accept user-provided GFP flags.

Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
include/linux/buffer_head.h | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 1738420..fed3f3a 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -308,6 +308,13 @@ sb_getblk(struct super_block *sb, sector_t block)
return __getblk_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
}

+
+static inline struct buffer_head *
+sb_getblk_gfp(struct super_block *sb, sector_t block, gfp_t gfp)
+{
+ return __getblk_gfp(sb->s_bdev, block, sb->s_blocksize, gfp);
+}
+
static inline struct buffer_head *
sb_find_get_block(struct super_block *sb, sector_t block)
{
--
1.9.1

2016-03-16 08:07:18

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 005/107] ext4: replace open coded nofail allocation in ext4_free_blocks()

From: Michal Hocko <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7444a072c387a93ebee7066e8aee776954ab0e41 upstream.

ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.

Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
fs/ext4/mballoc.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index cdfc763..46e6562 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4643,18 +4643,12 @@ do_more:
/*
* blocks being freed are metadata. these blocks shouldn't
* be used until this transaction is committed
+ *
+ * We use __GFP_NOFAIL because ext4_free_blocks() is not allowed
+ * to fail.
*/
- retry:
- new_entry = kmem_cache_alloc(ext4_free_data_cachep, GFP_NOFS);
- if (!new_entry) {
- /*
- * We use a retry loop because
- * ext4_free_blocks() is not allowed to fail.
- */
- cond_resched();
- congestion_wait(BLK_RW_ASYNC, HZ/50);
- goto retry;
- }
+ new_entry = kmem_cache_alloc(ext4_free_data_cachep,
+ GFP_NOFS|__GFP_NOFAIL);
new_entry->efd_start_cluster = bit;
new_entry->efd_group = block_group;
new_entry->efd_count = count_clusters;
--
1.9.1

2016-03-16 08:07:32

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 006/107] mm: avoid setting up anonymous pages into file mapping

From: "Kirill A. Shutemov" <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 6b7339f4c31ad69c8e9c0b2859276e22cf72176d upstream.

Reading page fault handler code I've noticed that under right
circumstances kernel would map anonymous pages into file mappings: if
the VMA doesn't have vm_ops->fault() and the VMA wasn't fully populated
on ->mmap(), kernel would handle page fault to not populated pte with
do_anonymous_page().

Let's change page fault handler to use do_anonymous_page() only on
anonymous VMA (->vm_ops == NULL) and make sure that the VMA is not
shared.

For file mappings without vm_ops->fault() or shred VMA without vm_ops,
page fault on pte_none() entry would lead to SIGBUS.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Willy Tarreau <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
mm/memory.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 02aef93..4774579 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3173,6 +3173,14 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,

pte_unmap(page_table);

+ /* File mapping without ->vm_ops ? */
+ if (vma->vm_flags & VM_SHARED)
+ return VM_FAULT_SIGBUS;
+
+ /* File mapping without ->vm_ops ? */
+ if (vma->vm_flags & VM_SHARED)
+ return VM_FAULT_SIGBUS;
+
/* Check if we need to add a guard page to the stack */
if (check_stack_guard_page(vma, address) < 0)
return VM_FAULT_SIGSEGV;
@@ -3432,6 +3440,9 @@ static int do_linear_fault(struct mm_struct *mm, struct vm_area_struct *vma,
- vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;

pte_unmap(page_table);
+ /* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */
+ if (!vma->vm_ops->fault)
+ return VM_FAULT_SIGBUS;
return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
}

@@ -3490,11 +3501,9 @@ int handle_pte_fault(struct mm_struct *mm,
entry = *pte;
if (!pte_present(entry)) {
if (pte_none(entry)) {
- if (vma->vm_ops) {
- if (likely(vma->vm_ops->fault))
- return do_linear_fault(mm, vma, address,
+ if (vma->vm_ops)
+ return do_linear_fault(mm, vma, address,
pte, pmd, flags, entry);
- }
return do_anonymous_page(mm, vma, address,
pte, pmd, flags);
}
--
1.9.1

2016-03-16 08:07:41

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 008/107] hpfs: hpfs_error: Remove static buffer, use vsprintf extension %pV instead

From: Joe Perches <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit a28e4b2b18ccb90df402da3f21e1a83c9d4f8ec1 upstream.

Removing unnecessary static buffers is good.
Use the vsprintf %pV extension instead.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[Mikulas:
- The bug corrected by the patch is - if hpfs_error is called concurrently
on multiple filesystems, it could corrupt the string because the text
buffer is shared. That's why I marked the patch for stable.]
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
fs/hpfs/super.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index 0b990b2..efc1823 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -52,17 +52,20 @@ static void unmark_dirty(struct super_block *s)
}

/* Filesystem error... */
-static char err_buf[1024];
-
void hpfs_error(struct super_block *s, const char *fmt, ...)
{
+ struct va_format vaf;
va_list args;

va_start(args, fmt);
- vsnprintf(err_buf, sizeof(err_buf), fmt, args);
+
+ vaf.fmt = fmt;
+ vaf.va = &args;
+
+ pr_err("filesystem error: %pV", &vaf);
+
va_end(args);

- printk("HPFS: filesystem error: %s", err_buf);
if (!hpfs_sb(s)->sb_was_error) {
if (hpfs_sb(s)->sb_err == 2) {
printk("; crashing the system because you wanted it\n");
--
1.9.1

2016-03-16 08:07:46

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 012/107] dm thin: allocate the cell_sort_array dynamically

From: Joe Thornber <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit a822c83e47d97cdef38c4352e1ef62d9f46cfe98 upstream.

Given the pool's cell_sort_array holds 8192 pointers it triggers an
order 5 allocation via kmalloc. This order 5 allocation is prone to
failure as system memory gets more fragmented over time.

Fix this by allocating the cell_sort_array using vmalloc.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
[lizf: Backported 3.4: it's prinson_{create,destroy}() that need fixing]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/dm-thin.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index e811e44..862612e 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -13,6 +13,7 @@
#include <linux/init.h>
#include <linux/module.h>
#include <linux/slab.h>
+#include <linux/vmalloc.h>

#define DM_MSG_PREFIX "thin"

@@ -149,9 +150,7 @@ static struct bio_prison *prison_create(unsigned nr_cells)
{
unsigned i;
uint32_t nr_buckets = calc_nr_buckets(nr_cells);
- size_t len = sizeof(struct bio_prison) +
- (sizeof(struct hlist_head) * nr_buckets);
- struct bio_prison *prison = kmalloc(len, GFP_KERNEL);
+ struct bio_prison *prison = kmalloc(sizeof(*prison), GFP_KERNEL);

if (!prison)
return NULL;
@@ -164,9 +163,15 @@ static struct bio_prison *prison_create(unsigned nr_cells)
return NULL;
}

+ prison->cells = vmalloc(sizeof(*prison->cells) * nr_buckets);
+ if (!prison->cells) {
+ mempool_destroy(prison->cell_pool);
+ kfree(prison);
+ return NULL;
+ }
+
prison->nr_buckets = nr_buckets;
prison->hash_mask = nr_buckets - 1;
- prison->cells = (struct hlist_head *) (prison + 1);
for (i = 0; i < nr_buckets; i++)
INIT_HLIST_HEAD(prison->cells + i);

@@ -175,6 +180,7 @@ static struct bio_prison *prison_create(unsigned nr_cells)

static void prison_destroy(struct bio_prison *prison)
{
+ vfree(prison->cells);
mempool_destroy(prison->cell_pool);
kfree(prison);
}
--
1.9.1

2016-03-16 08:07:53

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 015/107] dm btree: silence lockdep lock inversion in dm_btree_del()

From: Joe Thornber <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 1c7518794a3647eb345d59ee52844e8a40405198 upstream.

Allocate memory using GFP_NOIO when deleting a btree. dm_btree_del()
can be called via an ioctl and we don't want to recurse into the FS or
block layer.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/persistent-data/dm-btree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index 371f3d4..d05cf15 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -235,7 +235,7 @@ int dm_btree_del(struct dm_btree_info *info, dm_block_t root)
int r;
struct del_stack *s;

- s = kmalloc(sizeof(*s), GFP_KERNEL);
+ s = kmalloc(sizeof(*s), GFP_NOIO);
if (!s)
return -ENOMEM;
s->tm = info->tm;
--
1.9.1

2016-03-16 08:08:09

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 017/107] drm: add a check for x/y in drm_mode_setcrtc

From: Zhao Junwang <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 01447e9f04ba1c49a9534ae6a5a6f26c2bb05226 upstream.

legacy setcrtc ioctl does take a 32 bit value which might indeed
overflow

the checks of crtc_req->x > INT_MAX and crtc_req->y > INT_MAX aren't
needed any more with this

v2: -polish the annotation according to Daniel's comment

Cc: Daniel Vetter <[email protected]>
Signed-off-by: Zhao Junwang <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/gpu/drm/drm_crtc.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index c61e672..ed4b748 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -1836,8 +1836,11 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EINVAL;

- /* For some reason crtc x/y offsets are signed internally. */
- if (crtc_req->x > INT_MAX || crtc_req->y > INT_MAX)
+ /*
+ * Universal plane src offsets are only 16.16, prevent havoc for
+ * drivers using universal plane code internally.
+ */
+ if (crtc_req->x & 0xffff0000 || crtc_req->y & 0xffff0000)
return -ERANGE;

mutex_lock(&dev->mode_config.mutex);
--
1.9.1

2016-03-16 08:08:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 019/107] net: do not process device backlog during unregistration

From: Julian Anastasov <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit e9e4dd3267d0c5234c5c0f47440456b10875dec9 upstream.

commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
fixes a problem where processed packet comes from device
with destroyed inetdev (dev->ip_ptr). This is not expected
because inetdev_destroy is called in NETDEV_UNREGISTER
phase and packets should not be processed after
dev_close_many() and synchronize_net(). Above fix is still
required because inetdev_destroy can be called for other
reasons. But it shows the real problem: backlog can keep
packets for long time and they do not hold reference to
device. Such packets are then delivered to upper levels
at the same time when device is unregistered.
Calling flush_backlog after NETDEV_UNREGISTER_FINAL still
accounts all packets from backlog but before that some packets
continue to be delivered to upper levels long after the
synchronize_net call which is supposed to wait the last
ones. Also, as Eric pointed out, processed packets, mostly
from other devices, can continue to add new packets to backlog.

Fix the problem by moving flush_backlog early, after the
device driver is stopped and before the synchronize_net() call.
Then use netif_running check to make sure we do not add more
packets to backlog. We have to do it in enqueue_to_backlog
context when the local IRQ is disabled. As result, after the
flush_backlog and synchronize_net sequence all packets
should be accounted.

Thanks to Eric W. Biederman for the test script and his
valuable feedback!

Reported-by: Vittorio Gambaletta <[email protected]>
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: Julian Anastasov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/core/dev.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 9014952..1e363d0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2880,6 +2880,8 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
local_irq_save(flags);

rps_lock(sd);
+ if (!netif_running(skb->dev))
+ goto drop;
if (skb_queue_len(&sd->input_pkt_queue) <= netdev_max_backlog) {
if (skb_queue_len(&sd->input_pkt_queue)) {
enqueue:
@@ -2900,6 +2902,7 @@ enqueue:
goto enqueue;
}

+drop:
sd->dropped++;
rps_unlock(sd);

@@ -5228,6 +5231,7 @@ static void rollback_registered_many(struct list_head *head)
unlist_netdevice(dev);

dev->reg_state = NETREG_UNREGISTERING;
+ on_each_cpu(flush_backlog, dev, 1);
}

synchronize_net();
@@ -5791,8 +5795,6 @@ void netdev_run_todo(void)

dev->reg_state = NETREG_UNREGISTERED;

- on_each_cpu(flush_backlog, dev, 1);
-
netdev_wait_allrefs(dev);

/* paranoia */
--
1.9.1

2016-03-16 08:08:36

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 020/107] net: call rcu_read_lock early in process_backlog

From: Julian Anastasov <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 2c17d27c36dcce2b6bf689f41a46b9e909877c21 upstream.

Incoming packet should be either in backlog queue or
in RCU read-side section. Otherwise, the final sequence of
flush_backlog() and synchronize_net() may miss packets
that can run without device reference:

CPU 1 CPU 2
skb->dev: no reference
process_backlog:__skb_dequeue
process_backlog:local_irq_enable

on_each_cpu for
flush_backlog => IPI(hardirq): flush_backlog
- packet not found in backlog

CPU delayed ...
synchronize_net
- no ongoing RCU
read-side sections

netdev_run_todo,
rcu_barrier: no
ongoing callbacks
__netif_receive_skb_core:rcu_read_lock
- too late
free dev
process packet for freed dev

Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: Julian Anastasov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4:
- adjust context
- no need to change "goto unlock" to "goto out"]
Signed-off-by: Zefan Li <[email protected]>
---
net/core/dev.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 1e363d0..4f679bf 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3191,8 +3191,6 @@ static int __netif_receive_skb(struct sk_buff *skb)

pt_prev = NULL;

- rcu_read_lock();
-
another_round:

__this_cpu_inc(softnet_data.processed);
@@ -3287,7 +3285,6 @@ ncls:
}

out:
- rcu_read_unlock();
return ret;
}

@@ -3308,29 +3305,30 @@ out:
*/
int netif_receive_skb(struct sk_buff *skb)
{
+ int ret;
+
net_timestamp_check(netdev_tstamp_prequeue, skb);

if (skb_defer_rx_timestamp(skb))
return NET_RX_SUCCESS;

+ rcu_read_lock();
+
#ifdef CONFIG_RPS
if (static_key_false(&rps_needed)) {
struct rps_dev_flow voidflow, *rflow = &voidflow;
- int cpu, ret;
-
- rcu_read_lock();
-
- cpu = get_rps_cpu(skb->dev, skb, &rflow);
+ int cpu = get_rps_cpu(skb->dev, skb, &rflow);

if (cpu >= 0) {
ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
rcu_read_unlock();
return ret;
}
- rcu_read_unlock();
}
#endif
- return __netif_receive_skb(skb);
+ ret = __netif_receive_skb(skb);
+ rcu_read_unlock();
+ return ret;
}
EXPORT_SYMBOL(netif_receive_skb);

@@ -3721,8 +3719,10 @@ static int process_backlog(struct napi_struct *napi, int quota)
unsigned int qlen;

while ((skb = __skb_dequeue(&sd->process_queue))) {
+ rcu_read_lock();
local_irq_enable();
__netif_receive_skb(skb);
+ rcu_read_unlock();
local_irq_disable();
input_queue_head_incr(sd);
if (++work >= quota) {
--
1.9.1

2016-03-16 08:08:43

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 026/107] libata: add ATA_HORKAGE_NOTRIM

From: Arne Fitzenreiter <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 upstream.

Some devices lose data on TRIM whether queued or not. This patch adds
a horkage to disable TRIM.

tj: Collapsed unnecessary if() nesting.

Signed-off-by: Arne Fitzenreiter <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
[lizf: Backported to 3.4:
- adjust context
- drop changes to show_ata_dev_trim()]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/ata/libata-scsi.c | 3 ++-
include/linux/libata.h | 2 ++
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 15863a4..94ccbc1 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2461,7 +2461,8 @@ static unsigned int ata_scsiop_read_cap(struct ata_scsi_args *args, u8 *rbuf)
rbuf[14] = (lowest_aligned >> 8) & 0x3f;
rbuf[15] = lowest_aligned;

- if (ata_id_has_trim(args->id)) {
+ if (ata_id_has_trim(args->id) &&
+ !(dev->horkage & ATA_HORKAGE_NOTRIM)) {
rbuf[14] |= 0x80; /* TPE */

if (ata_id_has_zero_after_trim(args->id))
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 35e7f71..9736dbe 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -402,6 +402,8 @@ enum {
ATA_HORKAGE_BROKEN_FPDMA_AA = (1 << 15), /* skip AA */
ATA_HORKAGE_DUMP_ID = (1 << 16), /* dump IDENTIFY data */
ATA_HORKAGE_MAX_SEC_LBA48 = (1 << 17), /* Set max sects to 65535 */
+ ATA_HORKAGE_NOTRIM = (1 << 24), /* don't use TRIM */
+

/* DMA mask for user DMA control: User visible values; DO NOT
renumber */
--
1.9.1

2016-03-16 08:08:56

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 029/107] net: Clone skb before setting peeked flag

From: Herbert Xu <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 738ac1ebb96d02e0d23bc320302a6ea94c612dec upstream.

Shared skbs must not be modified and this is crucial for broadcast
and/or multicast paths where we use it as an optimisation to avoid
unnecessary cloning.

The function skb_recv_datagram breaks this rule by setting peeked
without cloning the skb first. This causes funky races which leads
to double-free.

This patch fixes this by cloning the skb and replacing the skb
in the list when setting skb->peeked.

Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv")
Reported-by: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/core/datagram.c | 41 ++++++++++++++++++++++++++++++++++++++---
1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/net/core/datagram.c b/net/core/datagram.c
index da7e0c8..ba96ad9 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -127,6 +127,35 @@ out_noerr:
goto out;
}

+static int skb_set_peeked(struct sk_buff *skb)
+{
+ struct sk_buff *nskb;
+
+ if (skb->peeked)
+ return 0;
+
+ /* We have to unshare an skb before modifying it. */
+ if (!skb_shared(skb))
+ goto done;
+
+ nskb = skb_clone(skb, GFP_ATOMIC);
+ if (!nskb)
+ return -ENOMEM;
+
+ skb->prev->next = nskb;
+ skb->next->prev = nskb;
+ nskb->prev = skb->prev;
+ nskb->next = skb->next;
+
+ consume_skb(skb);
+ skb = nskb;
+
+done:
+ skb->peeked = 1;
+
+ return 0;
+}
+
/**
* __skb_recv_datagram - Receive a datagram skbuff
* @sk: socket
@@ -161,7 +190,9 @@ out_noerr:
struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
int *peeked, int *off, int *err)
{
+ struct sk_buff_head *queue = &sk->sk_receive_queue;
struct sk_buff *skb;
+ unsigned long cpu_flags;
long timeo;
/*
* Caller is allowed not to check sk->sk_err before skb_recv_datagram()
@@ -180,8 +211,6 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
* Look at current nfs client by the way...
* However, this function was correct in any case. 8)
*/
- unsigned long cpu_flags;
- struct sk_buff_head *queue = &sk->sk_receive_queue;

spin_lock_irqsave(&queue->lock, cpu_flags);
skb_queue_walk(queue, skb) {
@@ -191,7 +220,11 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,
*off -= skb->len;
continue;
}
- skb->peeked = 1;
+
+ error = skb_set_peeked(skb);
+ if (error)
+ goto unlock_err;
+
atomic_inc(&skb->users);
} else
__skb_unlink(skb, queue);
@@ -210,6 +243,8 @@ struct sk_buff *__skb_recv_datagram(struct sock *sk, unsigned flags,

return NULL;

+unlock_err:
+ spin_unlock_irqrestore(&queue->lock, cpu_flags);
no_packet:
*err = error;
return NULL;
--
1.9.1

2016-03-16 08:09:03

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 030/107] NET: AX.25: Stop heartbeat timer on disconnect.

From: Richard Stearn <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit da278622bf04f8ddb14519a2b8214e108ef26101 upstream.

This may result in a kernel panic. The bug has always existed but
somehow we've run out of luck now and it bites.

Signed-off-by: Richard Stearn <[email protected]>
Signed-off-by: Ralf Baechle <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/ax25/ax25_subr.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/ax25/ax25_subr.c b/net/ax25/ax25_subr.c
index 1997538..3b78e84 100644
--- a/net/ax25/ax25_subr.c
+++ b/net/ax25/ax25_subr.c
@@ -264,6 +264,7 @@ void ax25_disconnect(ax25_cb *ax25, int reason)
{
ax25_clear_queues(ax25);

+ ax25_stop_heartbeat(ax25);
ax25_stop_t1timer(ax25);
ax25_stop_t2timer(ax25);
ax25_stop_t3timer(ax25);
--
1.9.1

2016-03-16 08:09:16

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 035/107] md: flush ->event_work before stopping array.

From: NeilBrown <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit ee5d004fd0591536a061451eba2b187092e9127c upstream.

The 'event_work' worker used by dm-raid may still be running
when the array is stopped. This can result in an oops.

So flush the workqueue on which it is run after detaching
and before destroying the device.

Reported-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Fixes: 9d09e663d550 ("dm: raid456 basic support")
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/md.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 25f0cb5..a875348 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5127,6 +5127,8 @@ static void __md_stop(struct mddev *mddev)
if (mddev->pers->sync_request && mddev->to_remove == NULL)
mddev->to_remove = &md_redundancy_group;
module_put(mddev->pers->owner);
+ /* Ensure ->event_work is done */
+ flush_workqueue(md_misc_wq);
mddev->pers = NULL;
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}
--
1.9.1

2016-03-16 08:09:28

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 040/107] xhci: prevent bus_suspend if SS port resuming in phase 1

From: Zhuang Jin Can <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit fac4271d1126c45ceaceb7f4a336317b771eb121 upstream.

When the link is just waken, it's in Resume state, and driver sets PLS to
U0. This refers to Phase 1. Phase 2 refers to when the link has completed
the transition from Resume state to U0.

With the fix of xhci: report U3 when link is in resume state, it also
exposes an issue that usb3 roothub and controller can suspend right
after phase 1, and this causes a hard hang in controller.

To fix the issue, we need to prevent usb3 bus suspend if any port is
resuming in phase 1.

[merge separate USB2 and USB3 port resume checking to one -Mathias]
Signed-off-by: Zhuang Jin Can <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/host/xhci-hub.c | 6 +++---
drivers/usb/host/xhci-ring.c | 3 +++
drivers/usb/host/xhci.h | 1 +
3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 9be0c29..fbbb11d 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -1011,10 +1011,10 @@ int xhci_bus_suspend(struct usb_hcd *hcd)
spin_lock_irqsave(&xhci->lock, flags);

if (hcd->self.root_hub->do_remote_wakeup) {
- if (bus_state->resuming_ports) {
+ if (bus_state->resuming_ports || /* USB2 */
+ bus_state->port_remote_wakeup) { /* USB3 */
spin_unlock_irqrestore(&xhci->lock, flags);
- xhci_dbg(xhci, "suspend failed because "
- "a port is resuming\n");
+ xhci_dbg(xhci, "suspend failed because a port is resuming\n");
return -EBUSY;
}
}
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 5e93425..b2afcb8 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1667,6 +1667,9 @@ static void handle_port_status(struct xhci_hcd *xhci,
usb_hcd_resume_root_hub(hcd);
}

+ if (hcd->speed == HCD_USB3 && (temp & PORT_PLS_MASK) == XDEV_INACTIVE)
+ bus_state->port_remote_wakeup &= ~(1 << faked_port_index);
+
if ((temp & PORT_PLC) && (temp & PORT_PLS_MASK) == XDEV_RESUME) {
xhci_dbg(xhci, "port resume event for port %d\n", port_id);

diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 855f084..f4116fc 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -280,6 +280,7 @@ struct xhci_op_regs {
#define XDEV_U0 (0x0 << 5)
#define XDEV_U2 (0x2 << 5)
#define XDEV_U3 (0x3 << 5)
+#define XDEV_INACTIVE (0x6 << 5)
#define XDEV_RESUME (0xf << 5)
/* true: port has power (see HCC_PPC) */
#define PORT_POWER (1 << 9)
--
1.9.1

2016-03-16 08:09:47

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 044/107] md/raid1: fix test for 'was read error from last working device'.

From: NeilBrown <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 34cab6f42003cb06f48f86a86652984dec338ae9 upstream.

When we get a read error from the last working device, we don't
try to repair it, and don't fail the device. We simple report a
read error to the caller.

However the current test for 'is this the last working device' is
wrong.
When there is only one fully working device, it assumes that a
non-faulty device is that device. However a spare which is rebuilding
would be non-faulty but so not the only working device.

So change the test from "!Faulty" to "In_sync". If ->degraded says
there is only one fully working device and this device is in_sync,
this must be the one.

This bug has existed since we allowed read_balance to read from
a recovering spare in v3.0

Reported-and-tested-by: Alexander Lyakas <[email protected]>
Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/raid1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 75e66c6..27af2f3 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -314,7 +314,7 @@ static void raid1_end_read_request(struct bio *bio, int error)
spin_lock_irqsave(&conf->device_lock, flags);
if (r1_bio->mddev->degraded == conf->raid_disks ||
(r1_bio->mddev->degraded == conf->raid_disks-1 &&
- !test_bit(Faulty, &conf->mirrors[mirror].rdev->flags)))
+ test_bit(In_sync, &conf->mirrors[mirror].rdev->flags)))
uptodate = 1;
spin_unlock_irqrestore(&conf->device_lock, flags);
}
--
1.9.1

2016-03-16 08:10:03

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 048/107] iscsi-target: Fix use-after-free during TPG session shutdown

From: Nicholas Bellinger <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 417c20a9bdd1e876384127cf096d8ae8b559066c upstream.

This patch fixes a use-after-free bug in iscsit_release_sessions_for_tpg()
where se_portal_group->session_lock was incorrectly released/re-acquired
while walking the active se_portal_group->tpg_sess_list.

The can result in a NULL pointer dereference when iscsit_close_session()
shutdown happens in the normal path asynchronously to this code, causing
a bogus dereference of an already freed list entry to occur.

To address this bug, walk the session list checking for the same state
as before, but move entries to a local list to avoid dropping the lock
while walking the active list.

As before, signal using iscsi_session->session_restatement=1 for those
list entries to be released locally by iscsit_free_session() code.

Reported-by: Sunilkumar Nadumuttlu <[email protected]>
Cc: Sunilkumar Nadumuttlu <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/target/iscsi/iscsi_target.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 56d02e0..79665f3 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4500,6 +4500,7 @@ int iscsit_release_sessions_for_tpg(struct iscsi_portal_group *tpg, int force)
struct iscsi_session *sess;
struct se_portal_group *se_tpg = &tpg->tpg_se_tpg;
struct se_session *se_sess, *se_sess_tmp;
+ LIST_HEAD(free_list);
int session_count = 0;

spin_lock_bh(&se_tpg->session_lock);
@@ -4521,14 +4522,17 @@ int iscsit_release_sessions_for_tpg(struct iscsi_portal_group *tpg, int force)
}
atomic_set(&sess->session_reinstatement, 1);
spin_unlock(&sess->conn_lock);
- spin_unlock_bh(&se_tpg->session_lock);

- iscsit_free_session(sess);
- spin_lock_bh(&se_tpg->session_lock);
+ list_move_tail(&se_sess->sess_list, &free_list);
+ }
+ spin_unlock_bh(&se_tpg->session_lock);

+ list_for_each_entry_safe(se_sess, se_sess_tmp, &free_list, sess_list) {
+ sess = (struct iscsi_session *)se_sess->fabric_sess_ptr;
+
+ iscsit_free_session(sess);
session_count++;
}
- spin_unlock_bh(&se_tpg->session_lock);

pr_debug("Released %d iSCSI Session(s) from Target Portal"
" Group: %hu\n", session_count, tpg->tpgt);
--
1.9.1

2016-03-16 08:10:10

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 053/107] drm/radeon/combios: add some validation of lvds values

From: Alex Deucher <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0a90a0cff9f429f886f423967ae053150dce9259 upstream.

Fixes a broken hsync start value uncovered by:
abc0b1447d4974963548777a5ba4a4457c82c426
(drm: Perform basic sanity checks on probed modes)

The driver handled the bad hsync start elsewhere, but
the above commit prevented it from getting added.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=91401

Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/gpu/drm/radeon/radeon_combios.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_combios.c b/drivers/gpu/drm/radeon/radeon_combios.c
index cf5dd63..b72eb50 100644
--- a/drivers/gpu/drm/radeon/radeon_combios.c
+++ b/drivers/gpu/drm/radeon/radeon_combios.c
@@ -1259,10 +1259,15 @@ struct radeon_encoder_lvds *radeon_combios_get_lvds_info(struct radeon_encoder

if ((RBIOS16(tmp) == lvds->native_mode.hdisplay) &&
(RBIOS16(tmp + 2) == lvds->native_mode.vdisplay)) {
+ u32 hss = (RBIOS16(tmp + 21) - RBIOS16(tmp + 19) - 1) * 8;
+
+ if (hss > lvds->native_mode.hdisplay)
+ hss = (10 - 1) * 8;
+
lvds->native_mode.htotal = lvds->native_mode.hdisplay +
(RBIOS16(tmp + 17) - RBIOS16(tmp + 19)) * 8;
lvds->native_mode.hsync_start = lvds->native_mode.hdisplay +
- (RBIOS16(tmp + 21) - RBIOS16(tmp + 19) - 1) * 8;
+ hss;
lvds->native_mode.hsync_end = lvds->native_mode.hsync_start +
(RBIOS8(tmp + 23) * 8);

--
1.9.1

2016-03-16 08:10:35

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 055/107] x86/xen: Probe target addresses in set_aliased_prot() before the hypercall

From: Andy Lutomirski <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit aa1acff356bbedfd03b544051f5b371746735d89 upstream.

The update_va_mapping hypercall can fail if the VA isn't present
in the guest's page tables. Under certain loads, this can
result in an OOPS when the target address is in unpopulated vmap
space.

While we're at it, add comments to help explain what's going on.

This isn't a great long-term fix. This code should probably be
changed to use something like set_memory_ro.

Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: David Vrabel <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected] <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/xen/enlighten.c | 40 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 40 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 9598038..8ade106 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -413,6 +413,7 @@ static void set_aliased_prot(void *v, pgprot_t prot)
pte_t pte;
unsigned long pfn;
struct page *page;
+ unsigned char dummy;

ptep = lookup_address((unsigned long)v, &level);
BUG_ON(ptep == NULL);
@@ -422,6 +423,32 @@ static void set_aliased_prot(void *v, pgprot_t prot)

pte = pfn_pte(pfn, prot);

+ /*
+ * Careful: update_va_mapping() will fail if the virtual address
+ * we're poking isn't populated in the page tables. We don't
+ * need to worry about the direct map (that's always in the page
+ * tables), but we need to be careful about vmap space. In
+ * particular, the top level page table can lazily propagate
+ * entries between processes, so if we've switched mms since we
+ * vmapped the target in the first place, we might not have the
+ * top-level page table entry populated.
+ *
+ * We disable preemption because we want the same mm active when
+ * we probe the target and when we issue the hypercall. We'll
+ * have the same nominal mm, but if we're a kernel thread, lazy
+ * mm dropping could change our pgd.
+ *
+ * Out of an abundance of caution, this uses __get_user() to fault
+ * in the target address just in case there's some obscure case
+ * in which the target address isn't readable.
+ */
+
+ preempt_disable();
+
+ pagefault_disable(); /* Avoid warnings due to being atomic. */
+ __get_user(dummy, (unsigned char __user __force *)v);
+ pagefault_enable();
+
if (HYPERVISOR_update_va_mapping((unsigned long)v, pte, 0))
BUG();

@@ -433,6 +460,8 @@ static void set_aliased_prot(void *v, pgprot_t prot)
BUG();
} else
kmap_flush_unused();
+
+ preempt_enable();
}

static void xen_alloc_ldt(struct desc_struct *ldt, unsigned entries)
@@ -440,6 +469,17 @@ static void xen_alloc_ldt(struct desc_struct *ldt, unsigned entries)
const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
int i;

+ /*
+ * We need to mark the all aliases of the LDT pages RO. We
+ * don't need to call vm_flush_aliases(), though, since that's
+ * only responsible for flushing aliases out the TLBs, not the
+ * page tables, and Xen will flush the TLB for us if needed.
+ *
+ * To avoid confusing future readers: none of this is necessary
+ * to load the LDT. The hypervisor only checks this when the
+ * LDT is faulted in due to subsequent descriptor access.
+ */
+
for(i = 0; i < entries; i += entries_per_page)
set_aliased_prot(ldt + i, PAGE_KERNEL_RO);
}
--
1.9.1

2016-03-16 08:10:41

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 054/107] target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT

From: Alexei Potashnik <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 9547308bda296b6f69876c840a0291fcfbeddbb8 upstream.

Make sure all non-READ SCSI commands get targ_xfer_tag initialized
to 0xffffffff, not just WRITEs.

Double-free of a TUR cmd object occurs under the following scenario:

1. TUR received (targ_xfer_tag is uninitialized and left at 0)
2. TUR status sent
3. First unsolicited NOPIN is sent to initiator (gets targ_xfer_tag of 0)
4. NOPOUT for NOPIN (with TTT=0) arrives
- its ExpStatSN acks TUR status, TUR is queued for removal
- LIO tries to find NOPIN with TTT=0, but finds the same TUR instead,
TUR is queued for removal for the 2nd time

(Drop unbalanced conditional bracket usage - nab)

Signed-off-by: Alexei Potashnik <[email protected]>
Signed-off-by: Spencer Baugh <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
[lizf: Backported to 3.4:
- adjust context
- leave the braces as it is]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/target/iscsi/iscsi_target.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index 79665f3..963f383 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -981,7 +981,7 @@ done:
if (cmd->targ_xfer_tag == 0xFFFFFFFF)
cmd->targ_xfer_tag = conn->sess->targ_xfer_tag++;
spin_unlock_bh(&conn->sess->ttt_lock);
- } else if (hdr->flags & ISCSI_FLAG_CMD_WRITE)
+ } else
cmd->targ_xfer_tag = 0xFFFFFFFF;
cmd->cmd_sn = hdr->cmdsn;
cmd->exp_stat_sn = hdr->exp_statsn;
--
1.9.1

2016-03-16 08:11:06

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 059/107] xhci: fix off by one error in TRB DMA address boundary check

From: Mathias Nyman <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7895086afde2a05fa24a0e410d8e6b75ca7c8fdd upstream.

We need to check that a TRB is part of the current segment
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off by one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
didn't catch the case when a TRB was the first element of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each in increasing order where the ring buffer wraps around and
causes errors like:

[ 106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
[ 106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address is one outside the end-seg address.

Tested-by: Arkadiusz Miśkiewicz <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/host/xhci-ring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b2afcb8..5623785 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -85,7 +85,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
return 0;
/* offset in TRBs */
segment_offset = trb - seg->trbs;
- if (segment_offset > TRBS_PER_SEGMENT)
+ if (segment_offset >= TRBS_PER_SEGMENT)
return 0;
return seg->dma + (segment_offset * sizeof(*trb));
}
--
1.9.1

2016-03-16 08:11:23

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 066/107] localmodconfig: Use Kbuild files too

From: Richard Weinberger <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c0ddc8c745b7f89c50385fd7aa03c78dc543fa7a upstream.

In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.

Link: http://lkml.kernel.org/r/[email protected]

Reported-and-tested-by: Leonidas Spyropoulos <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
scripts/kconfig/streamline_config.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/kconfig/streamline_config.pl b/scripts/kconfig/streamline_config.pl
index 3346f42..4a19a7f 100644
--- a/scripts/kconfig/streamline_config.pl
+++ b/scripts/kconfig/streamline_config.pl
@@ -125,7 +125,7 @@ my $ksource = $ARGV[0];
my $kconfig = $ARGV[1];
my $lsmod_file = $ENV{'LSMOD'};

-my @makefiles = `find $ksource -name Makefile 2>/dev/null`;
+my @makefiles = `find $ksource -name Makefile -or -name Kbuild 2>/dev/null`;
chomp @makefiles;

my %depends;
--
1.9.1

2016-03-16 08:11:37

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 067/107] dm btree: add ref counting ops for the leaves of top level btrees

From: Joe Thornber <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit b0dc3c8bc157c60b1d470163882be8c13e1950af upstream.

When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down. If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp). And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
[lizf: Backported to 3.4:
- drop const
- drop changes to remove_one()]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/persistent-data/dm-btree-internal.h | 6 +++++
drivers/md/persistent-data/dm-btree-remove.c | 12 +++------
drivers/md/persistent-data/dm-btree-spine.c | 37 ++++++++++++++++++++++++++
drivers/md/persistent-data/dm-btree.c | 7 +----
4 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/drivers/md/persistent-data/dm-btree-internal.h b/drivers/md/persistent-data/dm-btree-internal.h
index accbb05..c246578 100644
--- a/drivers/md/persistent-data/dm-btree-internal.h
+++ b/drivers/md/persistent-data/dm-btree-internal.h
@@ -131,4 +131,10 @@ int lower_bound(struct btree_node *n, uint64_t key);

extern struct dm_block_validator btree_node_validator;

+/*
+ * Value type for upper levels of multi-level btrees.
+ */
+extern void init_le64_type(struct dm_transaction_manager *tm,
+ struct dm_btree_value_type *vt);
+
#endif /* DM_BTREE_INTERNAL_H */
diff --git a/drivers/md/persistent-data/dm-btree-remove.c b/drivers/md/persistent-data/dm-btree-remove.c
index a03178e..7c0d755 100644
--- a/drivers/md/persistent-data/dm-btree-remove.c
+++ b/drivers/md/persistent-data/dm-btree-remove.c
@@ -544,14 +544,6 @@ static int remove_raw(struct shadow_spine *s, struct dm_btree_info *info,
return r;
}

-static struct dm_btree_value_type le64_type = {
- .context = NULL,
- .size = sizeof(__le64),
- .inc = NULL,
- .dec = NULL,
- .equal = NULL
-};
-
int dm_btree_remove(struct dm_btree_info *info, dm_block_t root,
uint64_t *keys, dm_block_t *new_root)
{
@@ -559,12 +551,14 @@ int dm_btree_remove(struct dm_btree_info *info, dm_block_t root,
int index = 0, r = 0;
struct shadow_spine spine;
struct btree_node *n;
+ struct dm_btree_value_type le64_vt;

+ init_le64_type(info->tm, &le64_vt);
init_shadow_spine(&spine, info);
for (level = 0; level < info->levels; level++) {
r = remove_raw(&spine, info,
(level == last_level ?
- &info->value_type : &le64_type),
+ &info->value_type : &le64_vt),
root, keys[level], (unsigned *)&index);
if (r < 0)
break;
diff --git a/drivers/md/persistent-data/dm-btree-spine.c b/drivers/md/persistent-data/dm-btree-spine.c
index 2f0805c..f6cb762 100644
--- a/drivers/md/persistent-data/dm-btree-spine.c
+++ b/drivers/md/persistent-data/dm-btree-spine.c
@@ -242,3 +242,40 @@ int shadow_root(struct shadow_spine *s)
{
return s->root;
}
+
+static void le64_inc(void *context, void *value_le)
+{
+ struct dm_transaction_manager *tm = context;
+ __le64 v_le;
+
+ memcpy(&v_le, value_le, sizeof(v_le));
+ dm_tm_inc(tm, le64_to_cpu(v_le));
+}
+
+static void le64_dec(void *context, void *value_le)
+{
+ struct dm_transaction_manager *tm = context;
+ __le64 v_le;
+
+ memcpy(&v_le, value_le, sizeof(v_le));
+ dm_tm_dec(tm, le64_to_cpu(v_le));
+}
+
+static int le64_equal(void *context, void *value1_le, void *value2_le)
+{
+ __le64 v1_le, v2_le;
+
+ memcpy(&v1_le, value1_le, sizeof(v1_le));
+ memcpy(&v2_le, value2_le, sizeof(v2_le));
+ return v1_le == v2_le;
+}
+
+void init_le64_type(struct dm_transaction_manager *tm,
+ struct dm_btree_value_type *vt)
+{
+ vt->context = tm;
+ vt->size = sizeof(__le64);
+ vt->inc = le64_inc;
+ vt->dec = le64_dec;
+ vt->equal = le64_equal;
+}
diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index d05cf15..dddd5a4 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -646,12 +646,7 @@ static int insert(struct dm_btree_info *info, dm_block_t root,
struct btree_node *n;
struct dm_btree_value_type le64_type;

- le64_type.context = NULL;
- le64_type.size = sizeof(__le64);
- le64_type.inc = NULL;
- le64_type.dec = NULL;
- le64_type.equal = NULL;
-
+ init_le64_type(info->tm, &le64_type);
init_shadow_spine(&spine, info);

for (level = 0; level < (info->levels - 1); level++) {
--
1.9.1

2016-03-16 08:11:58

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 073/107] sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state

From: lucien <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f648f807f61e64d247d26611e34cc97e4ed03401 upstream.

Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING by not
resetting the association overall_error_count. This allowed the association
to better enforce assoc.max_retrans limit.

However, the same issue still exists when the association is in SHUTDOWN_RECEIVED
state. In this state, HB-ACKs will continue to reset the overall_error_count
for the association would extend the lifetime of association unnecessarily.

This patch solves this by resetting the overall_error_count whenever the current
state is small then SCTP_STATE_SHUTDOWN_PENDING. As a small side-effect, we
end up also handling SCTP_STATE_SHUTDOWN_ACK_SENT and SCTP_STATE_SHUTDOWN_SENT
states, but they are not really impacted because we disable Heartbeats in those
states.

Fixes: Commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
Signed-off-by: Xin Long <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/sctp/sm_sideeffect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 1ff51c9..5fa033a 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -682,7 +682,7 @@ static void sctp_cmd_transport_on(sctp_cmd_seq_t *cmds,
* outstanding data and rely on the retransmission limit be reached
* to shutdown the association.
*/
- if (t->asoc->state != SCTP_STATE_SHUTDOWN_PENDING)
+ if (t->asoc->state < SCTP_STATE_SHUTDOWN_PENDING)
t->asoc->overall_error_count = 0;

/* Clear the hb_sent flag to signal that we had a good
--
1.9.1

2016-03-16 08:12:01

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 075/107] unix: avoid use-after-free in ep_remove_wait_queue

From: Rainer Weikusat <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7d267278a9ece963d77eefec61630223fce08c6c upstream.

Rainer Weikusat <[email protected]> writes:
An AF_UNIX datagram socket being the client in an n:1 association with
some server socket is only allowed to send messages to the server if the
receive queue of this socket contains at most sk_max_ack_backlog
datagrams. This implies that prospective writers might be forced to go
to sleep despite none of the message presently enqueued on the server
receive queue were sent by them. In order to ensure that these will be
woken up once space becomes again available, the present unix_dgram_poll
routine does a second sock_poll_wait call with the peer_wait wait queue
of the server socket as queue argument (unix_dgram_recvmsg does a wake
up on this queue after a datagram was received). This is inherently
problematic because the server socket is only guaranteed to remain alive
for as long as the client still holds a reference to it. In case the
connection is dissolved via connect or by the dead peer detection logic
in unix_dgram_sendmsg, the server socket may be freed despite "the
polling mechanism" (in particular, epoll) still has a pointer to the
corresponding peer_wait queue. There's no way to forcibly deregister a
wait queue with epoll.

Based on an idea by Jason Baron, the patch below changes the code such
that a wait_queue_t belonging to the client socket is enqueued on the
peer_wait queue of the server whenever the peer receive queue full
condition is detected by either a sendmsg or a poll. A wake up on the
peer queue is then relayed to the ordinary wait queue of the client
socket via wake function. The connection to the peer wait queue is again
dissolved if either a wake up is about to be relayed or the client
socket reconnects or a dead peer is detected or the client socket is
itself closed. This enables removing the second sock_poll_wait from
unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
that no blocked writer sleeps forever.

Signed-off-by: Rainer Weikusat <[email protected]>
Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
Reviewed-by: Jason Baron <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
include/net/af_unix.h | 1 +
net/unix/af_unix.c | 183 ++++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 165 insertions(+), 19 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index ca68e2c..d29a576 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -60,6 +60,7 @@ struct unix_sock {
unsigned int gc_maybe_cycle : 1;
unsigned char recursion_level;
struct socket_wq peer_wq;
+ wait_queue_t peer_wake;
};
#define unix_sk(__sk) ((struct unix_sock *)__sk)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 09a84c9..9e120d7 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -306,6 +306,118 @@ found:
return s;
}

+/* Support code for asymmetrically connected dgram sockets
+ *
+ * If a datagram socket is connected to a socket not itself connected
+ * to the first socket (eg, /dev/log), clients may only enqueue more
+ * messages if the present receive queue of the server socket is not
+ * "too large". This means there's a second writeability condition
+ * poll and sendmsg need to test. The dgram recv code will do a wake
+ * up on the peer_wait wait queue of a socket upon reception of a
+ * datagram which needs to be propagated to sleeping would-be writers
+ * since these might not have sent anything so far. This can't be
+ * accomplished via poll_wait because the lifetime of the server
+ * socket might be less than that of its clients if these break their
+ * association with it or if the server socket is closed while clients
+ * are still connected to it and there's no way to inform "a polling
+ * implementation" that it should let go of a certain wait queue
+ *
+ * In order to propagate a wake up, a wait_queue_t of the client
+ * socket is enqueued on the peer_wait queue of the server socket
+ * whose wake function does a wake_up on the ordinary client socket
+ * wait queue. This connection is established whenever a write (or
+ * poll for write) hit the flow control condition and broken when the
+ * association to the server socket is dissolved or after a wake up
+ * was relayed.
+ */
+
+static int unix_dgram_peer_wake_relay(wait_queue_t *q, unsigned mode, int flags,
+ void *key)
+{
+ struct unix_sock *u;
+ wait_queue_head_t *u_sleep;
+
+ u = container_of(q, struct unix_sock, peer_wake);
+
+ __remove_wait_queue(&unix_sk(u->peer_wake.private)->peer_wait,
+ q);
+ u->peer_wake.private = NULL;
+
+ /* relaying can only happen while the wq still exists */
+ u_sleep = sk_sleep(&u->sk);
+ if (u_sleep)
+ wake_up_interruptible_poll(u_sleep, key);
+
+ return 0;
+}
+
+static int unix_dgram_peer_wake_connect(struct sock *sk, struct sock *other)
+{
+ struct unix_sock *u, *u_other;
+ int rc;
+
+ u = unix_sk(sk);
+ u_other = unix_sk(other);
+ rc = 0;
+ spin_lock(&u_other->peer_wait.lock);
+
+ if (!u->peer_wake.private) {
+ u->peer_wake.private = other;
+ __add_wait_queue(&u_other->peer_wait, &u->peer_wake);
+
+ rc = 1;
+ }
+
+ spin_unlock(&u_other->peer_wait.lock);
+ return rc;
+}
+
+static void unix_dgram_peer_wake_disconnect(struct sock *sk,
+ struct sock *other)
+{
+ struct unix_sock *u, *u_other;
+
+ u = unix_sk(sk);
+ u_other = unix_sk(other);
+ spin_lock(&u_other->peer_wait.lock);
+
+ if (u->peer_wake.private == other) {
+ __remove_wait_queue(&u_other->peer_wait, &u->peer_wake);
+ u->peer_wake.private = NULL;
+ }
+
+ spin_unlock(&u_other->peer_wait.lock);
+}
+
+static void unix_dgram_peer_wake_disconnect_wakeup(struct sock *sk,
+ struct sock *other)
+{
+ unix_dgram_peer_wake_disconnect(sk, other);
+ wake_up_interruptible_poll(sk_sleep(sk),
+ POLLOUT |
+ POLLWRNORM |
+ POLLWRBAND);
+}
+
+/* preconditions:
+ * - unix_peer(sk) == other
+ * - association is stable
+ */
+static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
+{
+ int connected;
+
+ connected = unix_dgram_peer_wake_connect(sk, other);
+
+ if (unix_recvq_full(other))
+ return 1;
+
+ if (connected)
+ unix_dgram_peer_wake_disconnect(sk, other);
+
+ return 0;
+}
+
static inline int unix_writable(struct sock *sk)
{
return (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
@@ -410,6 +522,8 @@ static void unix_release_sock(struct sock *sk, int embrion)
skpair->sk_state_change(skpair);
sk_wake_async(skpair, SOCK_WAKE_WAITD, POLL_HUP);
}
+
+ unix_dgram_peer_wake_disconnect(sk, skpair);
sock_put(skpair); /* It may now die */
unix_peer(sk) = NULL;
}
@@ -646,6 +760,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock)
INIT_LIST_HEAD(&u->link);
mutex_init(&u->readlock); /* single task reading lock */
init_waitqueue_head(&u->peer_wait);
+ init_waitqueue_func_entry(&u->peer_wake, unix_dgram_peer_wake_relay);
unix_insert_socket(unix_sockets_unbound, sk);
out:
if (sk == NULL)
@@ -1020,6 +1135,8 @@ restart:
if (unix_peer(sk)) {
struct sock *old_peer = unix_peer(sk);
unix_peer(sk) = other;
+ unix_dgram_peer_wake_disconnect_wakeup(sk, old_peer);
+
unix_state_double_unlock(sk, other);

if (other != old_peer)
@@ -1459,6 +1576,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
long timeo;
struct scm_cookie tmp_scm;
int max_level;
+ int sk_locked;

if (NULL == siocb->scm)
siocb->scm = &tmp_scm;
@@ -1527,12 +1645,14 @@ restart:
goto out_free;
}

+ sk_locked = 0;
unix_state_lock(other);
+restart_locked:
err = -EPERM;
if (!unix_may_send(sk, other))
goto out_unlock;

- if (sock_flag(other, SOCK_DEAD)) {
+ if (unlikely(sock_flag(other, SOCK_DEAD))) {
/*
* Check with 1003.1g - what should
* datagram error
@@ -1540,10 +1660,14 @@ restart:
unix_state_unlock(other);
sock_put(other);

+ if (!sk_locked)
+ unix_state_lock(sk);
+
err = 0;
- unix_state_lock(sk);
if (unix_peer(sk) == other) {
unix_peer(sk) = NULL;
+ unix_dgram_peer_wake_disconnect_wakeup(sk, other);
+
unix_state_unlock(sk);

unix_dgram_disconnected(sk, other);
@@ -1569,21 +1693,38 @@ restart:
goto out_unlock;
}

- if (unix_peer(other) != sk && unix_recvq_full(other)) {
- if (!timeo) {
- err = -EAGAIN;
- goto out_unlock;
+ if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
+ if (timeo) {
+ timeo = unix_wait_for_peer(other, timeo);
+
+ err = sock_intr_errno(timeo);
+ if (signal_pending(current))
+ goto out_free;
+
+ goto restart;
}

- timeo = unix_wait_for_peer(other, timeo);
+ if (!sk_locked) {
+ unix_state_unlock(other);
+ unix_state_double_lock(sk, other);
+ }

- err = sock_intr_errno(timeo);
- if (signal_pending(current))
- goto out_free;
+ if (unix_peer(sk) != other ||
+ unix_dgram_peer_wake_me(sk, other)) {
+ err = -EAGAIN;
+ sk_locked = 1;
+ goto out_unlock;
+ }

- goto restart;
+ if (!sk_locked) {
+ sk_locked = 1;
+ goto restart_locked;
+ }
}

+ if (unlikely(sk_locked))
+ unix_state_unlock(sk);
+
if (sock_flag(other, SOCK_RCVTSTAMP))
__net_timestamp(skb);
maybe_add_creds(skb, sock, other);
@@ -1597,6 +1738,8 @@ restart:
return len;

out_unlock:
+ if (sk_locked)
+ unix_state_unlock(sk);
unix_state_unlock(other);
out_free:
kfree_skb(skb);
@@ -2229,14 +2372,16 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
return mask;

writable = unix_writable(sk);
- other = unix_peer_get(sk);
- if (other) {
- if (unix_peer(other) != sk) {
- sock_poll_wait(file, &unix_sk(other)->peer_wait, wait);
- if (unix_recvq_full(other))
- writable = 0;
- }
- sock_put(other);
+ if (writable) {
+ unix_state_lock(sk);
+
+ other = unix_peer(sk);
+ if (other && unix_peer(other) != sk &&
+ unix_recvq_full(other) &&
+ unix_dgram_peer_wake_me(sk, other))
+ writable = 0;
+
+ unix_state_unlock(sk);
}

if (writable)
--
1.9.1

2016-03-16 08:12:13

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 082/107] isdn_ppp: Add checks for allocation failure in isdn_ppp_open()

From: Ben Hutchings <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0baa57d8dc32db78369d8b5176ef56c5e2e18ab3 upstream.

Compile-tested only.

Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/isdn/i4l/isdn_ppp.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/isdn/i4l/isdn_ppp.c b/drivers/isdn/i4l/isdn_ppp.c
index a1e7601..ee41751 100644
--- a/drivers/isdn/i4l/isdn_ppp.c
+++ b/drivers/isdn/i4l/isdn_ppp.c
@@ -301,6 +301,8 @@ isdn_ppp_open(int min, struct file *file)
is->compflags = 0;

is->reset = isdn_ppp_ccp_reset_alloc(is);
+ if (!is->reset)
+ return -ENOMEM;

is->lp = NULL;
is->mp_seqno = 0; /* MP sequence number */
@@ -320,6 +322,10 @@ isdn_ppp_open(int min, struct file *file)
* VJ header compression init
*/
is->slcomp = slhc_init(16, 16); /* not necessary for 2. link in bundle */
+ if (!is->slcomp) {
+ isdn_ppp_ccp_reset_free(is);
+ return -ENOMEM;
+ }
#endif
#ifdef CONFIG_IPPP_FILTER
is->pass_filter = NULL;
--
1.9.1

2016-03-16 08:12:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 083/107] ppp, slip: Validate VJ compression slot parameters completely

From: Ben Hutchings <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4ab42d78e37a294ac7bc56901d563c642e03c4ae upstream.

Currently slhc_init() treats out-of-range values of rslots and tslots
as equivalent to 0, except that if tslots is too large it will
dereference a null pointer (CVE-2015-7799).

Add a range-check at the top of the function and make it return an
ERR_PTR() on error instead of NULL. Change the callers accordingly.

Compile-tested only.

Reported-by: 郭永刚 <[email protected]>
References: http://article.gmane.org/gmane.comp.security.oss.general/17908
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/isdn/i4l/isdn_ppp.c | 10 ++++------
drivers/net/ppp/ppp_generic.c | 6 ++----
drivers/net/slip/slhc.c | 12 ++++++++----
drivers/net/slip/slip.c | 2 +-
4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/isdn/i4l/isdn_ppp.c b/drivers/isdn/i4l/isdn_ppp.c
index ee41751..7d6c170 100644
--- a/drivers/isdn/i4l/isdn_ppp.c
+++ b/drivers/isdn/i4l/isdn_ppp.c
@@ -322,9 +322,9 @@ isdn_ppp_open(int min, struct file *file)
* VJ header compression init
*/
is->slcomp = slhc_init(16, 16); /* not necessary for 2. link in bundle */
- if (!is->slcomp) {
+ if (IS_ERR(is->slcomp)) {
isdn_ppp_ccp_reset_free(is);
- return -ENOMEM;
+ return PTR_ERR(is->slcomp);
}
#endif
#ifdef CONFIG_IPPP_FILTER
@@ -574,10 +574,8 @@ isdn_ppp_ioctl(int min, struct file *file, unsigned int cmd, unsigned long arg)
is->maxcid = val;
#ifdef CONFIG_ISDN_PPP_VJ
sltmp = slhc_init(16, val);
- if (!sltmp) {
- printk(KERN_ERR "ippp, can't realloc slhc struct\n");
- return -ENOMEM;
- }
+ if (IS_ERR(sltmp))
+ return PTR_ERR(sltmp);
if (is->slcomp)
slhc_free(is->slcomp);
is->slcomp = sltmp;
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 1207bb1..ba4411b 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -703,10 +703,8 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
val &= 0xffff;
}
vj = slhc_init(val2+1, val+1);
- if (!vj) {
- netdev_err(ppp->dev,
- "PPP: no memory (VJ compressor)\n");
- err = -ENOMEM;
+ if (IS_ERR(vj)) {
+ err = PTR_ERR(vj);
break;
}
ppp_lock(ppp);
diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c
index 1252d9c..b52eabc 100644
--- a/drivers/net/slip/slhc.c
+++ b/drivers/net/slip/slhc.c
@@ -84,8 +84,9 @@ static long decode(unsigned char **cpp);
static unsigned char * put16(unsigned char *cp, unsigned short x);
static unsigned short pull16(unsigned char **cpp);

-/* Initialize compression data structure
+/* Allocate compression data structure
* slots must be in range 0 to 255 (zero meaning no compression)
+ * Returns pointer to structure or ERR_PTR() on error.
*/
struct slcompress *
slhc_init(int rslots, int tslots)
@@ -94,11 +95,14 @@ slhc_init(int rslots, int tslots)
register struct cstate *ts;
struct slcompress *comp;

+ if (rslots < 0 || rslots > 255 || tslots < 0 || tslots > 255)
+ return ERR_PTR(-EINVAL);
+
comp = kzalloc(sizeof(struct slcompress), GFP_KERNEL);
if (! comp)
goto out_fail;

- if ( rslots > 0 && rslots < 256 ) {
+ if (rslots > 0) {
size_t rsize = rslots * sizeof(struct cstate);
comp->rstate = kzalloc(rsize, GFP_KERNEL);
if (! comp->rstate)
@@ -106,7 +110,7 @@ slhc_init(int rslots, int tslots)
comp->rslot_limit = rslots - 1;
}

- if ( tslots > 0 && tslots < 256 ) {
+ if (tslots > 0) {
size_t tsize = tslots * sizeof(struct cstate);
comp->tstate = kzalloc(tsize, GFP_KERNEL);
if (! comp->tstate)
@@ -141,7 +145,7 @@ out_free2:
out_free:
kfree(comp);
out_fail:
- return NULL;
+ return ERR_PTR(-ENOMEM);
}


diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index d4c9db3..1f22662 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -163,7 +163,7 @@ static int sl_alloc_bufs(struct slip *sl, int mtu)
if (cbuff == NULL)
goto err_exit;
slcomp = slhc_init(16, 16);
- if (slcomp == NULL)
+ if (IS_ERR(slcomp))
goto err_exit;
#endif
spin_lock_bh(&sl->lock);
--
1.9.1

2016-03-16 08:12:33

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 086/107] KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring

From: David Howells <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f05819df10d7b09f6d1eb6f8534a8f68e5a4fe61 upstream.

The following sequence of commands:

i=`keyctl add user a a @s`
keyctl request2 keyring foo bar @t
keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set. However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code. When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:

BUG: unable to handle kernel paging request at 00000000ffffff8a
IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
...
Workqueue: events key_garbage_collector
...
RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
RSP: 0018:ffff88003e2f3d30 EFLAGS: 00010203
RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
...
CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
...
Call Trace:
[<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
[<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
[<ffffffff8105ec9b>] process_one_work+0x28e/0x547
[<ffffffff8105fd17>] worker_thread+0x26e/0x361
[<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
[<ffffffff810648ad>] kthread+0xf3/0xfb
[<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
[<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
[<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX. This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully
instantiated.

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David Howells <[email protected]>
Tested-by: Dmitry Vyukov <[email protected]>
[lizf: Backported to 3.4: adjust indentation]
Signed-off-by: Zefan Li <[email protected]>
---
security/keys/gc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index f85d638..9e496ad 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -174,8 +174,10 @@ static noinline void key_gc_unused_key(struct key *key)
{
key_check(key);

- /* Throw away the key data */
- if (key->type->destroy)
+ /* Throw away the key data if the key is instantiated */
+ if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags) &&
+ !test_bit(KEY_FLAG_NEGATIVE, &key->flags) &&
+ key->type->destroy)
key->type->destroy(key);

security_key_free(key);
--
1.9.1

2016-03-16 08:12:37

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 087/107] ipv6: addrconf: validate new MTU before applying it

From: Marcelo Leitner <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 77751427a1ff25b27d47a4c36b12c3c8667855ac upstream.

Currently we don't check if the new MTU is valid or not and this allows
one to configure a smaller than minimum allowed by RFCs or even bigger
than interface own MTU, which is a problem as it may lead to packet
drops.

If you have a daemon like NetworkManager running, this may be exploited
by remote attackers by forging RA packets with an invalid MTU, possibly
leading to a DoS. (NetworkManager currently only validates for values
too small, but not for too big ones.)

The fix is just to make sure the new value is valid. That is, between
IPV6_MIN_MTU and interface's MTU.

Note that similar check is already performed at
ndisc_router_discovery(), for when kernel itself parses the RA.

Signed-off-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/ipv6/addrconf.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index be58760..b6c236b 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4340,6 +4340,21 @@ int addrconf_sysctl_forward(ctl_table *ctl, int write,
return ret;
}

+static
+int addrconf_sysctl_mtu(struct ctl_table *ctl, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct inet6_dev *idev = ctl->extra1;
+ int min_mtu = IPV6_MIN_MTU;
+ struct ctl_table lctl;
+
+ lctl = *ctl;
+ lctl.extra1 = &min_mtu;
+ lctl.extra2 = idev ? &idev->dev->mtu : NULL;
+
+ return proc_dointvec_minmax(&lctl, write, buffer, lenp, ppos);
+}
+
static void dev_disable_change(struct inet6_dev *idev)
{
if (!idev || !idev->dev)
@@ -4449,7 +4464,7 @@ static struct addrconf_sysctl_table
.data = &ipv6_devconf.mtu6,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = addrconf_sysctl_mtu,
},
{
.procname = "accept_ra",
--
1.9.1

2016-03-16 08:12:55

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 092/107] net: avoid to hang up on sending due to sysctl configuration overflow.

From: "[email protected]" <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit cdda88912d62f9603d27433338a18be83ef23ac1 upstream.

I found if we write a larger than 4GB value to some sysctl
variables, the sending syscall will hang up forever, because these
variables are 32 bits, such large values make them overflow to 0 or
negative.

This patch try to fix overflow or prevent from zero value setup
of below sysctl variables:

net.core.wmem_default
net.core.rmem_default

net.core.rmem_max
net.core.wmem_max

net.ipv4.udp_rmem_min
net.ipv4.udp_wmem_min

net.ipv4.tcp_wmem
net.ipv4.tcp_rmem

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Li Yu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/core/sysctl_net_core.c | 14 ++++++++++----
net/ipv4/sysctl_net_ipv4.c | 11 +++++++----
2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 77d1550..c04dadd 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -22,6 +22,8 @@
static int zero = 0;
static int ushort_max = USHRT_MAX;

+static int one = 1;
+
#ifdef CONFIG_RPS
static int rps_sock_flow_sysctl(ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
@@ -94,28 +96,32 @@ static struct ctl_table net_core_table[] = {
.data = &sysctl_wmem_max,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "rmem_max",
.data = &sysctl_rmem_max,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "wmem_default",
.data = &sysctl_wmem_default,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "rmem_default",
.data = &sysctl_rmem_default,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "dev_weight",
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 086c973..009e36d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -27,6 +27,7 @@
#include <net/tcp_memcontrol.h>

static int zero;
+static int one = 1;
static int tcp_retr1_max = 255;
static int ip_local_port_range_min[] = { 1, 1 };
static int ip_local_port_range_max[] = { 65535, 65535 };
@@ -486,14 +487,16 @@ static struct ctl_table ipv4_table[] = {
.data = &sysctl_tcp_wmem,
.maxlen = sizeof(sysctl_tcp_wmem),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "tcp_rmem",
.data = &sysctl_tcp_rmem,
.maxlen = sizeof(sysctl_tcp_rmem),
.mode = 0644,
- .proc_handler = proc_dointvec
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &one,
},
{
.procname = "tcp_app_win",
@@ -700,7 +703,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_rmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &zero
+ .extra1 = &one
},
{
.procname = "udp_wmem_min",
@@ -708,7 +711,7 @@ static struct ctl_table ipv4_table[] = {
.maxlen = sizeof(sysctl_udp_wmem_min),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
- .extra1 = &zero
+ .extra1 = &one
},
{ }
};
--
1.9.1

2016-03-16 08:13:06

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 095/107] atm: deal with setting entry before mkip was called

From: Sasha Levin <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 34f5b0066435ffb793049b84fafd29fa195bcf90 upstream.

If we didn't call ATMARP_MKIP before ATMARP_ENCAP the VCC descriptor is
non-existant and we'll end up dereferencing a NULL ptr:

[1033173.491930] kasan: GPF could be caused by NULL-ptr deref or user memory accessirq event stamp: 123386
[1033173.493678] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[1033173.493689] Modules linked in:
[1033173.493697] CPU: 9 PID: 23815 Comm: trinity-c64 Not tainted 4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
[1033173.493706] task: ffff8800630c4000 ti: ffff880063110000 task.ti: ffff880063110000
[1033173.493823] RIP: clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.493826] RSP: 0018:ffff880063117a88 EFLAGS: 00010203
[1033173.493828] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 000000000000000c
[1033173.493830] RDX: 0000000000000002 RSI: ffffffffb3f10720 RDI: 0000000000000014
[1033173.493832] RBP: ffff880063117b80 R08: ffff88047574d9a4 R09: 0000000000000000
[1033173.493834] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000c622f53
[1033173.493836] R13: ffff8800cb905500 R14: ffff8808d6da2000 R15: 00000000fffffdfd
[1033173.493840] FS: 00007fa56b92d700(0000) GS:ffff880478000000(0000) knlGS:0000000000000000
[1033173.493843] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[1033173.493845] CR2: 0000000000000000 CR3: 00000000630e8000 CR4: 00000000000006a0
[1033173.493855] Stack:
[1033173.493862] ffffffffb0b60444 000000000000eaea 0000000041b58ab3 ffffffffb3c3ce32
[1033173.493867] ffffffffb0b6f3e0 ffffffffb0b60444 ffffffffb5ea2e50 1ffff1000c622f5e
[1033173.493873] ffff8800630c4cd8 00000000000ee09a ffffffffb3ec4888 ffffffffb5ea2de8
[1033173.493874] Call Trace:
[1033173.494108] do_vcc_ioctl (net/atm/ioctl.c:170)
[1033173.494113] vcc_ioctl (net/atm/ioctl.c:189)
[1033173.494116] svc_ioctl (net/atm/svc.c:605)
[1033173.494200] sock_do_ioctl (net/socket.c:874)
[1033173.494204] sock_ioctl (net/socket.c:958)
[1033173.494244] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
[1033173.494290] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
[1033173.494295] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
[1033173.494362] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 50 09 00 00 49 8b 9e 60 06 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 14 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 14 09 00
All code

========
0: fa cli
1: 48 c1 ea 03 shr $0x3,%rdx
5: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
9: 0f 85 50 09 00 00 jne 0x95f
f: 49 8b 9e 60 06 00 00 mov 0x660(%r14),%rbx
16: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
1d: fc ff df
20: 48 8d 7b 14 lea 0x14(%rbx),%rdi
24: 48 89 fa mov %rdi,%rdx
27: 48 c1 ea 03 shr $0x3,%rdx
2b:* 0f b6 04 02 movzbl (%rdx,%rax,1),%eax <-- trapping instruction
2f: 48 89 fa mov %rdi,%rdx
32: 83 e2 07 and $0x7,%edx
35: 38 d0 cmp %dl,%al
37: 7f 08 jg 0x41
39: 84 c0 test %al,%al
3b: 0f 85 14 09 00 00 jne 0x955

Code starting with the faulting instruction
===========================================
0: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax
4: 48 89 fa mov %rdi,%rdx
7: 83 e2 07 and $0x7,%edx
a: 38 d0 cmp %dl,%al
c: 7f 08 jg 0x16
e: 84 c0 test %al,%al
10: 0f 85 14 09 00 00 jne 0x92a
[1033173.494366] RIP clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
[1033173.494368] RSP <ffff880063117a88>

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/atm/clip.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/atm/clip.c b/net/atm/clip.c
index 8ae3a78..e55e664 100644
--- a/net/atm/clip.c
+++ b/net/atm/clip.c
@@ -317,6 +317,9 @@ static int clip_constructor(struct neighbour *neigh)

static int clip_encap(struct atm_vcc *vcc, int mode)
{
+ if (!CLIP_VCC(vcc))
+ return -EBADFD;
+
CLIP_VCC(vcc)->encap = mode;
return 0;
}
--
1.9.1

2016-03-16 08:13:14

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 099/107] net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration

From: Hannes Frederic Sowa <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7bbadd2d1009575dad675afc16650ebb5aa10612 upstream.

Docbook does not like the definition of macros inside a field declaration
and adds a warning. Move the definition out.

Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
include/net/sock.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index e2073e0..ddc3737 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -328,8 +328,8 @@ struct sock {
sk_no_check : 2,
sk_userlocks : 4,
sk_protocol : 8,
-#define SK_PROTOCOL_MAX ((u8)~0U)
sk_type : 16;
+#define SK_PROTOCOL_MAX ((u8)~0U)
kmemcheck_bitfield_end(flags);
int sk_wmem_queued;
gfp_t sk_allocation;
--
1.9.1

2016-03-16 08:13:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 101/107] x86/LDT: Print the real LDT base address

From: Jan Beulich <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0d430e3fb3f7cdc13c0d22078b820f682821b45a upstream.

This was meant to print base address and entry count; make it do so
again.

Fixes: 37868fe113ff "x86/ldt: Make modify_ldt synchronous"
Signed-off-by: Jan Beulich <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/kernel/process_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3ebca08..d5d7313 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -119,7 +119,7 @@ void release_thread(struct task_struct *dead_task)
if (dead_task->mm->context.ldt) {
printk("WARNING: dead process %8s still has LDT? <%p/%d>\n",
dead_task->comm,
- dead_task->mm->context.ldt,
+ dead_task->mm->context.ldt->entries,
dead_task->mm->context.ldt->size);
BUG();
}
--
1.9.1

2016-03-16 08:13:33

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 103/107] ALSA: tlv: add DECLARE_TLV_DB_RANGE()

From: Clemens Ladisch <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit bf1d1c9b6179faa3bc32cee882462bc8eebde25d upstream.

Add a DECLARE_TLV_DB_RANGE() macro so that dB range information
can be specified without having to count the items manually for
TLV_DB_RANGE_HEAD().

Signed-off-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
include/sound/tlv.h | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/include/sound/tlv.h b/include/sound/tlv.h
index 137d165..49e7bd6 100644
--- a/include/sound/tlv.h
+++ b/include/sound/tlv.h
@@ -71,6 +71,10 @@

/* dB range container */
/* Each item is: <min> <max> <TLV> */
+#define TLV_DB_RANGE_ITEM(...) \
+ TLV_ITEM(SNDRV_CTL_TLVT_DB_RANGE, __VA_ARGS__)
+#define DECLARE_TLV_DB_RANGE(name, ...) \
+ unsigned int name[] = { TLV_DB_RANGE_ITEM(__VA_ARGS__) }
/* The below assumes that each item TLV is 4 words like DB_SCALE or LINEAR */
#define TLV_DB_RANGE_HEAD(num) \
SNDRV_CTL_TLVT_DB_RANGE, 6 * (num) * sizeof(unsigned int)
--
1.9.1

2016-03-16 08:13:53

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 104/107] ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly

From: Anssi Hannula <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 42e3121d90f42e57f6dbd6083dff2f57b3ec7daa upstream.

AudioQuest DragonFly DAC reports a volume control range of 0..50
(0x0000..0x0032) which in USB Audio means a range of 0 .. 0.2dB, which
is obviously incorrect and would cause software using the dB information
in e.g. volume sliders to have a massive volume difference in 100..102%
range.

Commit 2d1cb7f658fb ("ALSA: usb-audio: add dB range mapping for some
devices") added a dB range mapping for it with range 0..50 dB.

However, the actual volume mapping seems to be neither linear volume nor
linear dB scale, but instead quite close to the cubic mapping e.g.
alsamixer uses, with a range of approx. -53...0 dB.

Replace the previous quirk with a custom dB mapping based on some basic
output measurements, using a 10-item range TLV (which will still fit in
alsa-lib MAX_TLV_RANGE_SIZE).

Tested on AudioQuest DragonFly HW v1.2. The quirk is only applied if the
range is 0..50, so if this gets fixed/changed in later HW revisions it
will no longer be applied.

v2: incorporated Takashi Iwai's suggestion for the quirk application
method

Signed-off-by: Anssi Hannula <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
[lizf: Backoported to 3.4: use dev_info() instead of usb_audio_info()]
Signed-off-by: Zefan Li <[email protected]>
---
sound/usb/mixer.c | 2 ++
sound/usb/mixer_maps.c | 12 ------------
sound/usb/mixer_quirks.c | 37 +++++++++++++++++++++++++++++++++++++
sound/usb/mixer_quirks.h | 4 ++++
4 files changed, 43 insertions(+), 12 deletions(-)

diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index c419aa3..67a827d 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -1211,6 +1211,8 @@ static void build_feature_ctl(struct mixer_build *state, void *raw_desc,
break;
}

+ snd_usb_mixer_fu_apply_quirk(state->mixer, cval, unitid, kctl);
+
range = (cval->max - cval->min) / cval->res;
/* Are there devices with volume range more than 255? I use a bit more
* to be sure. 384 is a resolution magic number found on Logitech
diff --git a/sound/usb/mixer_maps.c b/sound/usb/mixer_maps.c
index 893b750..cb98040 100644
--- a/sound/usb/mixer_maps.c
+++ b/sound/usb/mixer_maps.c
@@ -319,13 +319,6 @@ static struct usbmix_name_map bose_companion5_map[] = {
{ 0 } /* terminator */
};

-/* Dragonfly DAC 1.2, the dB conversion factor is 1 instead of 256 */
-static struct usbmix_dB_map dragonfly_1_2_dB = {0, 5000};
-static struct usbmix_name_map dragonfly_1_2_map[] = {
- { 7, NULL, .dB = &dragonfly_1_2_dB },
- { 0 } /* terminator */
-};
-
/*
* Control map entries
*/
@@ -413,11 +406,6 @@ static struct usbmix_ctl_map usbmix_ctl_maps[] = {
.id = USB_ID(0x05a7, 0x1020),
.map = bose_companion5_map,
},
- {
- /* Dragonfly DAC 1.2 */
- .id = USB_ID(0x21b4, 0x0081),
- .map = dragonfly_1_2_map,
- },
{ 0 } /* terminator */
};

diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c
index 040d101..21f4d44 100644
--- a/sound/usb/mixer_quirks.c
+++ b/sound/usb/mixer_quirks.c
@@ -34,6 +34,7 @@
#include <sound/control.h>
#include <sound/hwdep.h>
#include <sound/info.h>
+#include <sound/tlv.h>

#include "usbaudio.h"
#include "mixer.h"
@@ -682,3 +683,39 @@ void snd_usb_mixer_rc_memory_change(struct usb_mixer_interface *mixer,
}
}

+static void snd_dragonfly_quirk_db_scale(struct usb_mixer_interface *mixer,
+ struct snd_kcontrol *kctl)
+{
+ /* Approximation using 10 ranges based on output measurement on hw v1.2.
+ * This seems close to the cubic mapping e.g. alsamixer uses. */
+ static const DECLARE_TLV_DB_RANGE(scale,
+ 0, 1, TLV_DB_MINMAX_ITEM(-5300, -4970),
+ 2, 5, TLV_DB_MINMAX_ITEM(-4710, -4160),
+ 6, 7, TLV_DB_MINMAX_ITEM(-3884, -3710),
+ 8, 14, TLV_DB_MINMAX_ITEM(-3443, -2560),
+ 15, 16, TLV_DB_MINMAX_ITEM(-2475, -2324),
+ 17, 19, TLV_DB_MINMAX_ITEM(-2228, -2031),
+ 20, 26, TLV_DB_MINMAX_ITEM(-1910, -1393),
+ 27, 31, TLV_DB_MINMAX_ITEM(-1322, -1032),
+ 32, 40, TLV_DB_MINMAX_ITEM(-968, -490),
+ 41, 50, TLV_DB_MINMAX_ITEM(-441, 0),
+ );
+
+ dev_info(&mixer->chip->dev->dev, "applying DragonFly dB scale quirk\n");
+ kctl->tlv.p = scale;
+ kctl->vd[0].access |= SNDRV_CTL_ELEM_ACCESS_TLV_READ;
+ kctl->vd[0].access &= ~SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK;
+}
+
+void snd_usb_mixer_fu_apply_quirk(struct usb_mixer_interface *mixer,
+ struct usb_mixer_elem_info *cval, int unitid,
+ struct snd_kcontrol *kctl)
+{
+ switch (mixer->chip->usb_id) {
+ case USB_ID(0x21b4, 0x0081): /* AudioQuest DragonFly */
+ if (unitid == 7 && cval->min == 0 && cval->max == 50)
+ snd_dragonfly_quirk_db_scale(mixer, kctl);
+ break;
+ }
+}
+
diff --git a/sound/usb/mixer_quirks.h b/sound/usb/mixer_quirks.h
index bdbfab0..177c329 100644
--- a/sound/usb/mixer_quirks.h
+++ b/sound/usb/mixer_quirks.h
@@ -9,5 +9,9 @@ void snd_emuusb_set_samplerate(struct snd_usb_audio *chip,
void snd_usb_mixer_rc_memory_change(struct usb_mixer_interface *mixer,
int unitid);

+void snd_usb_mixer_fu_apply_quirk(struct usb_mixer_interface *mixer,
+ struct usb_mixer_elem_info *cval, int unitid,
+ struct snd_kcontrol *kctl);
+
#endif /* SND_USB_MIXER_QUIRKS_H */

--
1.9.1

2016-03-16 08:13:57

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 106/107] dm btree remove: fix a bug when rebalancing nodes after removal

From: Joe Thornber <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream.

Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't
a complete fix for redistribute3().

The redistribute3 function takes 3 btree nodes and shares out the entries
evenly between them. If the three nodes in total contained
(MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
the center.

Fix this issue by being more careful about calculating the target number
of entries for the left and right nodes.

Unit tested in userspace using this program:
https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/persistent-data/dm-btree-remove.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/md/persistent-data/dm-btree-remove.c b/drivers/md/persistent-data/dm-btree-remove.c
index 7c0d755..92cd09f 100644
--- a/drivers/md/persistent-data/dm-btree-remove.c
+++ b/drivers/md/persistent-data/dm-btree-remove.c
@@ -301,11 +301,16 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
{
int s;
uint32_t max_entries = le32_to_cpu(left->header.max_entries);
- unsigned target = (nr_left + nr_center + nr_right) / 3;
- BUG_ON(target > max_entries);
+ unsigned total = nr_left + nr_center + nr_right;
+ unsigned target_right = total / 3;
+ unsigned remainder = (target_right * 3) != total;
+ unsigned target_left = target_right + remainder;
+
+ BUG_ON(target_left > max_entries);
+ BUG_ON(target_right > max_entries);

if (nr_left < nr_right) {
- s = nr_left - target;
+ s = nr_left - target_left;

if (s < 0 && nr_center < -s) {
/* not enough in central node */
@@ -316,10 +321,10 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
} else
shift(left, center, s);

- shift(center, right, target - nr_right);
+ shift(center, right, target_right - nr_right);

} else {
- s = target - nr_right;
+ s = target_right - nr_right;
if (s > 0 && nr_center < s) {
/* not enough in central node */
shift(center, right, nr_center);
@@ -329,7 +334,7 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
} else
shift(center, right, s);

- shift(left, center, nr_left - target);
+ shift(left, center, nr_left - target_left);
}

*key_ptr(parent, c->index) = center->keys[0];
--
1.9.1

2016-03-16 08:13:50

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 105/107] usb: dwc3: Fix assignment of EP transfer resources

From: John Youn <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c450960187f45d4260db87c7dd4fc0bceb5565d8 upstream.

The assignement of EP transfer resources was not handled properly in the
dwc3 driver. Commit aebda6187181 ("usb: dwc3: Reset the transfer
resource index on SET_INTERFACE") previously fixed one aspect of this
where resources may be exhausted with multiple calls to SET_INTERFACE.
However, it introduced an issue where composite devices with multiple
interfaces can be assigned the same transfer resources for different
endpoints. This patch solves both issues.

The assignment of transfer resources cannot perfectly follow the data
book due to the fact that the controller driver does not have all
knowledge of the configuration in advance. It is given this information
piecemeal by the composite gadget framework after every
SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook
programming model in this scenario can cause errors. For two reasons:

1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and
SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple
interfaces.

2) The databook does not mention doing more DEPXFERCFG for new endpoint
on alt setting (8.1.6).

The following simplified method is used instead:

All hardware endpoints can be assigned a transfer resource and this
setting will stay persistent until either a core reset or hibernation.
So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for
every hardware endpoint as well. We are guaranteed that there are as
many transfer resources as endpoints.

This patch triggers off of the calling dwc3_gadget_start_config() for
EP0-out, which always happens first, and which should only happen in one
of the above conditions.

Fixes: aebda6187181 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE")
Reported-by: Ravi Babu <[email protected]>
Signed-off-by: John Youn <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/dwc3/core.h | 1 -
drivers/usb/dwc3/ep0.c | 5 ----
drivers/usb/dwc3/gadget.c | 70 +++++++++++++++++++++++++++++++++++------------
3 files changed, 52 insertions(+), 24 deletions(-)

diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 194cafd..9707601 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -614,7 +614,6 @@ struct dwc3 {
unsigned three_stage_setup:1;
unsigned ep0_bounced:1;
unsigned ep0_expect_in:1;
- unsigned start_config_issued:1;
unsigned setup_packet_pending:1;
unsigned delayed_status:1;
unsigned needs_fifo_resize:1;
diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index 7c0eaeb..b6051f3 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -442,7 +442,6 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
u32 cfg;
int ret;

- dwc->start_config_issued = false;
cfg = le16_to_cpu(ctrl->wValue);

switch (dwc->dev_state) {
@@ -496,10 +495,6 @@ static int dwc3_ep0_std_request(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
dev_vdbg(dwc->dev, "USB_REQ_SET_CONFIGURATION\n");
ret = dwc3_ep0_set_config(dwc, ctrl);
break;
- case USB_REQ_SET_INTERFACE:
- dev_vdbg(dwc->dev ,"USB_REQ_SET_INTERFACE");
- dwc->start_config_issued = false;
- /* Fall through */
default:
dev_vdbg(dwc->dev, "Forwarding to gadget driver\n");
ret = dwc3_ep0_delegate_req(dwc, ctrl);
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index b43c6f9..fba74f8 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -359,24 +359,66 @@ static void dwc3_free_trb_pool(struct dwc3_ep *dep)
dep->trb_pool_dma = 0;
}

+static int dwc3_gadget_set_xfer_resource(struct dwc3 *dwc, struct dwc3_ep *dep);
+
+/**
+ * dwc3_gadget_start_config - Configure EP resources
+ * @dwc: pointer to our controller context structure
+ * @dep: endpoint that is being enabled
+ *
+ * The assignment of transfer resources cannot perfectly follow the
+ * data book due to the fact that the controller driver does not have
+ * all knowledge of the configuration in advance. It is given this
+ * information piecemeal by the composite gadget framework after every
+ * SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook
+ * programming model in this scenario can cause errors. For two
+ * reasons:
+ *
+ * 1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION
+ * and SET_INTERFACE (8.1.5). This is incorrect in the scenario of
+ * multiple interfaces.
+ *
+ * 2) The databook does not mention doing more DEPXFERCFG for new
+ * endpoint on alt setting (8.1.6).
+ *
+ * The following simplified method is used instead:
+ *
+ * All hardware endpoints can be assigned a transfer resource and this
+ * setting will stay persistent until either a core reset or
+ * hibernation. So whenever we do a DEPSTARTCFG(0) we can go ahead and
+ * do DEPXFERCFG for every hardware endpoint as well. We are
+ * guaranteed that there are as many transfer resources as endpoints.
+ *
+ * This function is called for each endpoint when it is being enabled
+ * but is triggered only when called for EP0-out, which always happens
+ * first, and which should only happen in one of the above conditions.
+ */
static int dwc3_gadget_start_config(struct dwc3 *dwc, struct dwc3_ep *dep)
{
struct dwc3_gadget_ep_cmd_params params;
u32 cmd;
+ int i;
+ int ret;
+
+ if (dep->number)
+ return 0;

memset(&params, 0x00, sizeof(params));
+ cmd = DWC3_DEPCMD_DEPSTARTCFG;

- if (dep->number != 1) {
- cmd = DWC3_DEPCMD_DEPSTARTCFG;
- /* XferRscIdx == 0 for ep0 and 2 for the remaining */
- if (dep->number > 1) {
- if (dwc->start_config_issued)
- return 0;
- dwc->start_config_issued = true;
- cmd |= DWC3_DEPCMD_PARAM(2);
- }
+ ret = dwc3_send_gadget_ep_cmd(dwc, 0, cmd, &params);
+ if (ret)
+ return ret;

- return dwc3_send_gadget_ep_cmd(dwc, 0, cmd, &params);
+ for (i = 0; i < DWC3_ENDPOINTS_NUM; i++) {
+ struct dwc3_ep *dep = dwc->eps[i];
+
+ if (!dep)
+ continue;
+
+ ret = dwc3_gadget_set_xfer_resource(dwc, dep);
+ if (ret)
+ return ret;
}

return 0;
@@ -471,10 +513,6 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep,
struct dwc3_trb *trb_st_hw;
struct dwc3_trb *trb_link;

- ret = dwc3_gadget_set_xfer_resource(dwc, dep);
- if (ret)
- return ret;
-
dep->desc = desc;
dep->comp_desc = comp_desc;
dep->type = usb_endpoint_type(desc);
@@ -1375,8 +1413,6 @@ static int dwc3_gadget_start(struct usb_gadget *g,
reg |= dwc->maximum_speed;
dwc3_writel(dwc->regs, DWC3_DCFG, reg);

- dwc->start_config_issued = false;
-
/* Start with SuperSpeed Default */
dwc3_gadget_ep0_desc.wMaxPacketSize = cpu_to_le16(512);

@@ -1861,7 +1897,6 @@ static void dwc3_gadget_disconnect_interrupt(struct dwc3 *dwc)

dwc3_stop_active_transfers(dwc);
dwc3_disconnect_gadget(dwc);
- dwc->start_config_issued = false;

dwc->gadget.speed = USB_SPEED_UNKNOWN;
dwc->setup_packet_pending = false;
@@ -1949,7 +1984,6 @@ static void dwc3_gadget_reset_interrupt(struct dwc3 *dwc)

dwc3_stop_active_transfers(dwc);
dwc3_clear_stall_all_ep(dwc);
- dwc->start_config_issued = false;

/* Reset device address to zero */
reg = dwc3_readl(dwc->regs, DWC3_DCFG);
--
1.9.1

2016-03-16 08:14:52

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 107/107] KVM: x86: move steal time initialization to vcpu entry time

From: Marcelo Tosatti <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7cae2bedcbd4680b155999655e49c27b9cf020fa upstream.

As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.

Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).

To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.

Signed-off-by: Marcelo Tosatti <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/kvm/x86.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4ad2b7b..9cc83e2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1545,6 +1545,8 @@ static void accumulate_steal_time(struct kvm_vcpu *vcpu)

static void record_steal_time(struct kvm_vcpu *vcpu)
{
+ accumulate_steal_time(vcpu);
+
if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
return;

@@ -1665,12 +1667,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
if (!(data & KVM_MSR_ENABLED))
break;

- vcpu->arch.st.last_steal = current->sched_info.run_delay;
-
- preempt_disable();
- accumulate_steal_time(vcpu);
- preempt_enable();
-
kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);

break;
@@ -2327,7 +2323,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
vcpu->cpu = cpu;
}

- accumulate_steal_time(vcpu);
kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
}

--
1.9.1

2016-03-16 08:15:59

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 102/107] ALSA: tlv: compute TLV_*_ITEM lengths automatically

From: Clemens Ladisch <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit b5b9eb546762c4015c67c31364a6ec6f83fd2ada upstream.

Add helper macros with a little bit of preprocessor magic to
automatically compute the length of a TLV item. This lets us avoid
having to compute this by hand, and will allow to use items that do
not use a fixed length.

Signed-off-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
include/sound/tlv.h | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/include/sound/tlv.h b/include/sound/tlv.h
index 7067e2d..137d165 100644
--- a/include/sound/tlv.h
+++ b/include/sound/tlv.h
@@ -38,21 +38,26 @@
#define SNDRV_CTL_TLVT_DB_MINMAX 4 /* dB scale with min/max */
#define SNDRV_CTL_TLVT_DB_MINMAX_MUTE 5 /* dB scale with min/max with mute */

+#define TLV_ITEM(type, ...) \
+ (type), TLV_LENGTH(__VA_ARGS__), __VA_ARGS__
+#define TLV_LENGTH(...) \
+ ((unsigned int)sizeof((const unsigned int[]) { __VA_ARGS__ }))
+
#define TLV_DB_SCALE_MASK 0xffff
#define TLV_DB_SCALE_MUTE 0x10000
#define TLV_DB_SCALE_ITEM(min, step, mute) \
- SNDRV_CTL_TLVT_DB_SCALE, 2 * sizeof(unsigned int), \
- (min), ((step) & TLV_DB_SCALE_MASK) | ((mute) ? TLV_DB_SCALE_MUTE : 0)
+ TLV_ITEM(SNDRV_CTL_TLVT_DB_SCALE, \
+ (min), \
+ ((step) & TLV_DB_SCALE_MASK) | \
+ ((mute) ? TLV_DB_SCALE_MUTE : 0))
#define DECLARE_TLV_DB_SCALE(name, min, step, mute) \
unsigned int name[] = { TLV_DB_SCALE_ITEM(min, step, mute) }

/* dB scale specified with min/max values instead of step */
#define TLV_DB_MINMAX_ITEM(min_dB, max_dB) \
- SNDRV_CTL_TLVT_DB_MINMAX, 2 * sizeof(unsigned int), \
- (min_dB), (max_dB)
+ TLV_ITEM(SNDRV_CTL_TLVT_DB_MINMAX, (min_dB), (max_dB))
#define TLV_DB_MINMAX_MUTE_ITEM(min_dB, max_dB) \
- SNDRV_CTL_TLVT_DB_MINMAX_MUTE, 2 * sizeof(unsigned int), \
- (min_dB), (max_dB)
+ TLV_ITEM(SNDRV_CTL_TLVT_DB_MINMAX_MUTE, (min_dB), (max_dB))
#define DECLARE_TLV_DB_MINMAX(name, min_dB, max_dB) \
unsigned int name[] = { TLV_DB_MINMAX_ITEM(min_dB, max_dB) }
#define DECLARE_TLV_DB_MINMAX_MUTE(name, min_dB, max_dB) \
@@ -60,8 +65,7 @@

/* linear volume between min_dB and max_dB (.01dB unit) */
#define TLV_DB_LINEAR_ITEM(min_dB, max_dB) \
- SNDRV_CTL_TLVT_DB_LINEAR, 2 * sizeof(unsigned int), \
- (min_dB), (max_dB)
+ TLV_ITEM(SNDRV_CTL_TLVT_DB_LINEAR, (min_dB), (max_dB))
#define DECLARE_TLV_DB_LINEAR(name, min_dB, max_dB) \
unsigned int name[] = { TLV_DB_LINEAR_ITEM(min_dB, max_dB) }

--
1.9.1

2016-03-16 08:16:29

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 100/107] af_unix: Guard against other == sk in unix_dgram_sendmsg

From: Rainer Weikusat <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit a5527dda344fff0514b7989ef7a755729769daa1 upstream.

The unix_dgram_sendmsg routine use the following test

if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {

to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.

Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn <[email protected]>
Signed-off-by: Rainer Weikusat <[email protected]>
Tested-by: Philipp Hahn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/unix/af_unix.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 9e120d7..de07f94 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1693,7 +1693,12 @@ restart_locked:
goto out_unlock;
}

- if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
+ /* other == sk && unix_peer(other) != sk if
+ * - unix_peer(sk) == NULL, destination address bound to sk
+ * - unix_peer(sk) == sk by time of get but disconnected before lock
+ */
+ if (other != sk &&
+ unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
if (timeo) {
timeo = unix_wait_for_peer(other, timeo);

--
1.9.1

2016-03-16 08:16:34

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 098/107] kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one

From: Ben Zhang <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 62572e29bc530b38921ef6059088b4788a9832a5 upstream.

I ran into a scenario where while one cpu was stuck and should have
panic'd because of the NMI watchdog, it didn't. The reason was another
cpu was spewing stack dumps on to the console. Upon investigation, I
noticed that when writing to the console and also when dumping the
stack, the watchdog is touched.

This causes all the cpus to reset their NMI watchdog flags and the
'stuck' cpu just spins forever.

This change causes the semantics of touch_nmi_watchdog to be changed
slightly. Previously, I accidentally changed the semantics and we
noticed there was a codepath in which touch_nmi_watchdog could be
touched from a preemtible area. That caused a BUG() to happen when
CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code.

My attempt here re-introduces the change to have the
touch_nmi_watchdog() code only touch the local cpu instead of all of the
cpus. But instead of using __get_cpu_var(), I use the
__raw_get_cpu_var() version.

This avoids the preemption problem. However my reasoning wasn't because
I was trying to be lazy. Instead I rationalized it as, well if
preemption is enabled then interrupts should be enabled to and the NMI
watchdog will have no reason to trigger. So it won't matter if the
wrong cpu is touched because the percpu interrupt counters the NMI
watchdog uses should still be incrementing.

Don said:

: I'm ok with this patch, though it does alter the behaviour of how
: touch_nmi_watchdog works. For the most part I don't think most callers
: need to touch all of the watchdogs (on each cpu). Perhaps a corner case
: will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ).
:
: But this does address an issue where if a system is locked up and one cpu
: is spewing out useful debug messages (or error messages), the hard lockup
: will fail to go off. We have seen this on RHEL also.

Signed-off-by: Don Zickus <[email protected]>
Signed-off-by: Ben Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
kernel/watchdog.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 991aa93..7527c8c 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -162,6 +162,14 @@ void touch_nmi_watchdog(void)
per_cpu(watchdog_nmi_touch, cpu) = true;
}
}
+ /*
+ * Using __raw here because some code paths have
+ * preemption enabled. If preemption is enabled
+ * then interrupts should be enabled too, in which
+ * case we shouldn't have to worry about the watchdog
+ * going off.
+ */
+ __raw_get_cpu_var(watchdog_nmi_touch) = true;
touch_softlockup_watchdog();
}
EXPORT_SYMBOL(touch_nmi_watchdog);
--
1.9.1

2016-03-16 08:17:15

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 096/107] SUNRPC: never enqueue a ->rq_cong request on ->sending

From: Neil Brown <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 298073181112a6ab6c30fe7971b99de968daf81e upstream.

If the sending queue has a task without ->rq_cong set at the front,
and then a number of tasks with ->rq_cong set such that they use
the entire congestion window, then the queue deadlocks. The first
entry cannot be processed until later entries complete.

This scenario has been seen with a client using UDP to access a server,
and the network connection breaking for a period of time - it doesn't
recover.

It never really makes sense for an ->rq_cong request to be on the ->sending
queue, but it can happen when a request is being retried, and finds
the transport if locked (XPRT_LOCKED). In this case we simple call
__xprt_put_cong() and the deadlock goes away.

Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/sunrpc/xprt.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index f1a63c1..fe57284 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -66,6 +66,7 @@ static void xprt_init(struct rpc_xprt *xprt, struct net *net);
static void xprt_request_init(struct rpc_task *, struct rpc_xprt *);
static void xprt_connect_status(struct rpc_task *task);
static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
+static void __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
static void xprt_destroy(struct rpc_xprt *xprt);

static DEFINE_SPINLOCK(xprt_list_lock);
@@ -269,6 +270,8 @@ int xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task)
}
xprt_clear_locked(xprt);
out_sleep:
+ if (req)
+ __xprt_put_cong(xprt, req);
dprintk("RPC: %5u failed to lock transport %p\n", task->tk_pid, xprt);
task->tk_timeout = 0;
task->tk_status = -EAGAIN;
--
1.9.1

2016-03-16 08:17:17

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 094/107] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

From: Andrey Vagin <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1 task 2 task 3
nf_conntrack_find_get
____nf_conntrack_find
destroy_conntrack
hlist_nulls_del_rcu
nf_conntrack_free
kmem_cache_free
__nf_conntrack_alloc
kmem_cache_alloc
memset(&ct->tuplehash[IP_CT_DIR_MAX],
if (nf_ct_is_dying(ct))
if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>] [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622] [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697] [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770] [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843] [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919] [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063] [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207] [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277] [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346] [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419] [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491] [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562] [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638] [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712] [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785] [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858] [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081] [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151] [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229] [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303] [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378] [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454] [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531] [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <[email protected]>
Cc: Florian Westphal <[email protected]>
Cc: Pablo Neira Ayuso <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: Jozsef Kadlecsik <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Signed-off-by: Andrey Vagin <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/netfilter/nf_conntrack_core.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9a171b2..71935fc 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -309,6 +309,21 @@ static void death_by_timeout(unsigned long ul_conntrack)
nf_ct_put(ct);
}

+static inline bool
+nf_ct_key_equal(struct nf_conntrack_tuple_hash *h,
+ const struct nf_conntrack_tuple *tuple,
+ u16 zone)
+{
+ struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+
+ /* A conntrack can be recreated with the equal tuple,
+ * so we need to check that the conntrack is confirmed
+ */
+ return nf_ct_tuple_equal(tuple, &h->tuple) &&
+ nf_ct_zone(ct) == zone &&
+ nf_ct_is_confirmed(ct);
+}
+
/*
* Warning :
* - Caller must take a reference on returned object
@@ -330,8 +345,7 @@ ____nf_conntrack_find(struct net *net, u16 zone,
local_bh_disable();
begin:
hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[bucket], hnnode) {
- if (nf_ct_tuple_equal(tuple, &h->tuple) &&
- nf_ct_zone(nf_ct_tuplehash_to_ctrack(h)) == zone) {
+ if (nf_ct_key_equal(h, tuple, zone)) {
NF_CT_STAT_INC(net, found);
local_bh_enable();
return h;
@@ -378,8 +392,7 @@ begin:
!atomic_inc_not_zero(&ct->ct_general.use)))
h = NULL;
else {
- if (unlikely(!nf_ct_tuple_equal(tuple, &h->tuple) ||
- nf_ct_zone(ct) != zone)) {
+ if (unlikely(!nf_ct_key_equal(h, tuple, zone))) {
nf_ct_put(ct);
goto begin;
}
--
1.9.1

2016-03-16 08:17:12

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 097/107] ipv6: prevent fib6_run_gc() contention

From: Michal Kubeček <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 2ac3ac8f86f2fe065d746d9a9abaca867adec577 upstream.

On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.

This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.

Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.

Signed-off-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
include/net/ip6_fib.h | 2 +-
net/ipv6/ip6_fib.c | 19 ++++++++-----------
net/ipv6/ndisc.c | 4 ++--
net/ipv6/route.c | 4 ++--
4 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 0ae759a..49c4cfe 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -266,7 +266,7 @@ extern void inet6_rt_notify(int event, struct rt6_info *rt,
struct nl_info *info);

extern void fib6_run_gc(unsigned long expires,
- struct net *net);
+ struct net *net, bool force);

extern void fib6_gc_cleanup(void);

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 2cfcfb7..fc5ce6e 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1593,19 +1593,16 @@ static int fib6_age(struct rt6_info *rt, void *arg)

static DEFINE_SPINLOCK(fib6_gc_lock);

-void fib6_run_gc(unsigned long expires, struct net *net)
+void fib6_run_gc(unsigned long expires, struct net *net, bool force)
{
- if (expires != ~0UL) {
+ if (force) {
spin_lock_bh(&fib6_gc_lock);
- gc_args.timeout = expires ? (int)expires :
- net->ipv6.sysctl.ip6_rt_gc_interval;
- } else {
- if (!spin_trylock_bh(&fib6_gc_lock)) {
- mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
- return;
- }
- gc_args.timeout = net->ipv6.sysctl.ip6_rt_gc_interval;
+ } else if (!spin_trylock_bh(&fib6_gc_lock)) {
+ mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
+ return;
}
+ gc_args.timeout = expires ? (int)expires :
+ net->ipv6.sysctl.ip6_rt_gc_interval;

gc_args.more = icmp6_dst_gc();

@@ -1622,7 +1619,7 @@ void fib6_run_gc(unsigned long expires, struct net *net)

static void fib6_gc_timer_cb(unsigned long arg)
{
- fib6_run_gc(0, (struct net *)arg);
+ fib6_run_gc(0, (struct net *)arg, true);
}

static int __net_init fib6_net_init(struct net *net)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 5cc78e6..e235b4c 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1737,11 +1737,11 @@ static int ndisc_netdev_event(struct notifier_block *this, unsigned long event,
switch (event) {
case NETDEV_CHANGEADDR:
neigh_changeaddr(&nd_tbl, dev);
- fib6_run_gc(~0UL, net);
+ fib6_run_gc(0, net, false);
break;
case NETDEV_DOWN:
neigh_ifdown(&nd_tbl, dev);
- fib6_run_gc(~0UL, net);
+ fib6_run_gc(0, net, false);
break;
case NETDEV_NOTIFY_PEERS:
ndisc_send_unsol_na(dev);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c909217..7ab7f8a 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1245,7 +1245,7 @@ static int ip6_dst_gc(struct dst_ops *ops)
goto out;

net->ipv6.ip6_rt_gc_expire++;
- fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net);
+ fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, entries > rt_max_size);
net->ipv6.ip6_rt_last_gc = now;
entries = dst_entries_get_slow(ops);
if (entries < ops->gc_thresh)
@@ -2840,7 +2840,7 @@ int ipv6_sysctl_rtcache_flush(ctl_table *ctl, int write,
net = (struct net *)ctl->extra1;
delay = net->ipv6.sysctl.flush_delay;
proc_dointvec(ctl, write, buffer, lenp, ppos);
- fib6_run_gc(delay <= 0 ? ~0UL : (unsigned long)delay, net);
+ fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
return 0;
}

--
1.9.1

2016-03-16 08:18:15

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 091/107] Initialize msg/shm IPC objects before doing ipc_addid()

From: Linus Torvalds <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit b9a532277938798b53178d5a66af6e2915cb27cf upstream.

As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
having initialized the IPC object state. Yes, we initialize the IPC
object in a locked state, but with all the lockless RCU lookup work,
that IPC object lock no longer means that the state cannot be seen.

We already did this for the IPC semaphore code (see commit e8577d1f0329:
"ipc/sem.c: fully initialize sem_array before making it visible") but we
clearly forgot about msg and shm.

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
ipc/msg.c | 18 +++++++++---------
ipc/shm.c | 13 +++++++------
ipc/util.c | 8 ++++----
3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 25f1a61..391e3e0 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -198,6 +198,15 @@ static int newque(struct ipc_namespace *ns, struct ipc_params *params)
return retval;
}

+ msq->q_stime = msq->q_rtime = 0;
+ msq->q_ctime = get_seconds();
+ msq->q_cbytes = msq->q_qnum = 0;
+ msq->q_qbytes = ns->msg_ctlmnb;
+ msq->q_lspid = msq->q_lrpid = 0;
+ INIT_LIST_HEAD(&msq->q_messages);
+ INIT_LIST_HEAD(&msq->q_receivers);
+ INIT_LIST_HEAD(&msq->q_senders);
+
/*
* ipc_addid() locks msq
*/
@@ -208,15 +217,6 @@ static int newque(struct ipc_namespace *ns, struct ipc_params *params)
return id;
}

- msq->q_stime = msq->q_rtime = 0;
- msq->q_ctime = get_seconds();
- msq->q_cbytes = msq->q_qnum = 0;
- msq->q_qbytes = ns->msg_ctlmnb;
- msq->q_lspid = msq->q_lrpid = 0;
- INIT_LIST_HEAD(&msq->q_messages);
- INIT_LIST_HEAD(&msq->q_receivers);
- INIT_LIST_HEAD(&msq->q_senders);
-
msg_unlock(msq);

return msq->q_perm.id;
diff --git a/ipc/shm.c b/ipc/shm.c
index a02ef57..634b0ba 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -500,12 +500,6 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
if (IS_ERR(file))
goto no_file;

- id = ipc_addid(&shm_ids(ns), &shp->shm_perm, ns->shm_ctlmni);
- if (id < 0) {
- error = id;
- goto no_id;
- }
-
shp->shm_cprid = task_tgid_vnr(current);
shp->shm_lprid = 0;
shp->shm_atim = shp->shm_dtim = 0;
@@ -514,6 +508,13 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
shp->shm_nattch = 0;
shp->shm_file = file;
shp->shm_creator = current;
+
+ id = ipc_addid(&shm_ids(ns), &shp->shm_perm, ns->shm_ctlmni);
+ if (id < 0) {
+ error = id;
+ goto no_id;
+ }
+
/*
* shmid gets reported as "inode#" in /proc/pid/maps.
* proc-ps tools use this. Changing this will break them.
diff --git a/ipc/util.c b/ipc/util.c
index 75261a3..e4c9377 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -264,6 +264,10 @@ int ipc_addid(struct ipc_ids* ids, struct kern_ipc_perm* new, int size)
rcu_read_lock();
spin_lock(&new->lock);

+ current_euid_egid(&euid, &egid);
+ new->cuid = new->uid = euid;
+ new->gid = new->cgid = egid;
+
err = idr_get_new(&ids->ipcs_idr, new, &id);
if (err) {
spin_unlock(&new->lock);
@@ -273,10 +277,6 @@ int ipc_addid(struct ipc_ids* ids, struct kern_ipc_perm* new, int size)

ids->in_use++;

- current_euid_egid(&euid, &egid);
- new->cuid = new->uid = euid;
- new->gid = new->cgid = egid;
-
new->seq = ids->seq++;
if(ids->seq > ids->seq_max)
ids->seq = 0;
--
1.9.1

2016-03-16 08:18:14

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 093/107] ipv6: probe routes asynchronous in rt6_probe

From: Hannes Frederic Sowa <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c2f17e827b419918c856131f592df9521e1a38e3 upstream.

Routes need to be probed asynchronous otherwise the call stack gets
exhausted when the kernel attemps to deliver another skb inline, like
e.g. xt_TEE does, and we probe at the same time.

We update neigh->updated still at once, otherwise we would send to
many probes.

Cc: Julian Anastasov <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/ipv6/route.c | 37 +++++++++++++++++++++++++++++++------
1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 23b3304..c909217 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -404,6 +404,24 @@ out:
}

#ifdef CONFIG_IPV6_ROUTER_PREF
+struct __rt6_probe_work {
+ struct work_struct work;
+ struct in6_addr target;
+ struct net_device *dev;
+};
+
+static void rt6_probe_deferred(struct work_struct *w)
+{
+ struct in6_addr mcaddr;
+ struct __rt6_probe_work *work =
+ container_of(w, struct __rt6_probe_work, work);
+
+ addrconf_addr_solict_mult(&work->target, &mcaddr);
+ ndisc_send_ns(work->dev, NULL, &work->target, &mcaddr, NULL);
+ dev_put(work->dev);
+ kfree(w);
+}
+
static void rt6_probe(struct rt6_info *rt)
{
struct neighbour *neigh;
@@ -422,15 +440,22 @@ static void rt6_probe(struct rt6_info *rt)
read_lock_bh(&neigh->lock);
if (!(neigh->nud_state & NUD_VALID) &&
time_after(jiffies, neigh->updated + rt->rt6i_idev->cnf.rtr_probe_interval)) {
- struct in6_addr mcaddr;
- struct in6_addr *target;
+ struct __rt6_probe_work *work;
+
+ work = kmalloc(sizeof(*work), GFP_ATOMIC);
+
+ if (work)
+ neigh->updated = jiffies;

- neigh->updated = jiffies;
read_unlock_bh(&neigh->lock);

- target = (struct in6_addr *)&neigh->primary_key;
- addrconf_addr_solict_mult(target, &mcaddr);
- ndisc_send_ns(rt->dst.dev, NULL, target, &mcaddr, NULL);
+ if (work) {
+ INIT_WORK(&work->work, rt6_probe_deferred);
+ work->target = rt->rt6i_gateway;
+ dev_hold(rt->dst.dev);
+ work->dev = rt->dst.dev;
+ schedule_work(&work->work);
+ }
} else {
read_unlock_bh(&neigh->lock);
}
--
1.9.1

2016-03-16 08:18:52

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 090/107] get rid of s_files and files_lock

From: Al Viro <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit eee5cc2702929fd41cce28058dc6d6717f723f87 upstream.

The only thing we need it for is alt-sysrq-r (emergency remount r/o)
and these days we can do just as well without going through the
list of files.

Signed-off-by: Al Viro <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
fs/file_table.c | 130 -----------------------------------------------------
fs/internal.h | 3 --
fs/open.c | 2 -
fs/super.c | 21 +--------
include/linux/fs.h | 13 ------
5 files changed, 2 insertions(+), 167 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 70f2a0f..a01710a 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -34,9 +34,6 @@ struct files_stat_struct files_stat = {
.max_files = NR_FILE
};

-DECLARE_LGLOCK(files_lglock);
-DEFINE_LGLOCK(files_lglock);
-
/* SLAB cache for file structures */
static struct kmem_cache *filp_cachep __read_mostly;

@@ -129,7 +126,6 @@ struct file *get_empty_filp(void)
if (security_file_alloc(f))
goto fail_sec;

- INIT_LIST_HEAD(&f->f_u.fu_list);
atomic_long_set(&f->f_count, 1);
rwlock_init(&f->f_owner.lock);
spin_lock_init(&f->f_lock);
@@ -252,7 +248,6 @@ static void __fput(struct file *file)
}
fops_put(file->f_op);
put_pid(file->f_owner.pid);
- file_sb_list_del(file);
if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
i_readcount_dec(inode);
if (file->f_mode & FMODE_WRITE)
@@ -382,134 +377,10 @@ void put_filp(struct file *file)
{
if (atomic_long_dec_and_test(&file->f_count)) {
security_file_free(file);
- file_sb_list_del(file);
file_free(file);
}
}

-static inline int file_list_cpu(struct file *file)
-{
-#ifdef CONFIG_SMP
- return file->f_sb_list_cpu;
-#else
- return smp_processor_id();
-#endif
-}
-
-/* helper for file_sb_list_add to reduce ifdefs */
-static inline void __file_sb_list_add(struct file *file, struct super_block *sb)
-{
- struct list_head *list;
-#ifdef CONFIG_SMP
- int cpu;
- cpu = smp_processor_id();
- file->f_sb_list_cpu = cpu;
- list = per_cpu_ptr(sb->s_files, cpu);
-#else
- list = &sb->s_files;
-#endif
- list_add(&file->f_u.fu_list, list);
-}
-
-/**
- * file_sb_list_add - add a file to the sb's file list
- * @file: file to add
- * @sb: sb to add it to
- *
- * Use this function to associate a file with the superblock of the inode it
- * refers to.
- */
-void file_sb_list_add(struct file *file, struct super_block *sb)
-{
- lg_local_lock(files_lglock);
- __file_sb_list_add(file, sb);
- lg_local_unlock(files_lglock);
-}
-
-/**
- * file_sb_list_del - remove a file from the sb's file list
- * @file: file to remove
- * @sb: sb to remove it from
- *
- * Use this function to remove a file from its superblock.
- */
-void file_sb_list_del(struct file *file)
-{
- if (!list_empty(&file->f_u.fu_list)) {
- lg_local_lock_cpu(files_lglock, file_list_cpu(file));
- list_del_init(&file->f_u.fu_list);
- lg_local_unlock_cpu(files_lglock, file_list_cpu(file));
- }
-}
-
-#ifdef CONFIG_SMP
-
-/*
- * These macros iterate all files on all CPUs for a given superblock.
- * files_lglock must be held globally.
- */
-#define do_file_list_for_each_entry(__sb, __file) \
-{ \
- int i; \
- for_each_possible_cpu(i) { \
- struct list_head *list; \
- list = per_cpu_ptr((__sb)->s_files, i); \
- list_for_each_entry((__file), list, f_u.fu_list)
-
-#define while_file_list_for_each_entry \
- } \
-}
-
-#else
-
-#define do_file_list_for_each_entry(__sb, __file) \
-{ \
- struct list_head *list; \
- list = &(sb)->s_files; \
- list_for_each_entry((__file), list, f_u.fu_list)
-
-#define while_file_list_for_each_entry \
-}
-
-#endif
-
-/**
- * mark_files_ro - mark all files read-only
- * @sb: superblock in question
- *
- * All files are marked read-only. We don't care about pending
- * delete files so this should be used in 'force' mode only.
- */
-void mark_files_ro(struct super_block *sb)
-{
- struct file *f;
-
-retry:
- lg_global_lock(files_lglock);
- do_file_list_for_each_entry(sb, f) {
- struct vfsmount *mnt;
- if (!S_ISREG(f->f_path.dentry->d_inode->i_mode))
- continue;
- if (!file_count(f))
- continue;
- if (!(f->f_mode & FMODE_WRITE))
- continue;
- spin_lock(&f->f_lock);
- f->f_mode &= ~FMODE_WRITE;
- spin_unlock(&f->f_lock);
- if (file_check_writeable(f) != 0)
- continue;
- file_release_write(f);
- mnt = mntget(f->f_path.mnt);
- /* This can sleep, so we can't hold the spinlock. */
- lg_global_unlock(files_lglock);
- mnt_drop_write(mnt);
- mntput(mnt);
- goto retry;
- } while_file_list_for_each_entry;
- lg_global_unlock(files_lglock);
-}
-
void __init files_init(unsigned long mempages)
{
unsigned long n;
@@ -525,6 +396,5 @@ void __init files_init(unsigned long mempages)
n = (mempages * (PAGE_SIZE / 1024)) / 10;
files_stat.max_files = max_t(unsigned long, n, NR_FILE);
files_defer_init();
- lg_lock_init(files_lglock);
percpu_counter_init(&nr_files, 0);
}
diff --git a/fs/internal.h b/fs/internal.h
index 9962c59..ed005c5 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -67,9 +67,6 @@ extern void chroot_fs_refs(struct path *, struct path *);
/*
* file_table.c
*/
-extern void file_sb_list_add(struct file *f, struct super_block *sb);
-extern void file_sb_list_del(struct file *f);
-extern void mark_files_ro(struct super_block *);
extern struct file *get_empty_filp(void);

/*
diff --git a/fs/open.c b/fs/open.c
index cf1d34f..703b051 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -672,7 +672,6 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
f->f_path.dentry = dentry;
f->f_path.mnt = mnt;
f->f_pos = 0;
- file_sb_list_add(f, inode->i_sb);

if (unlikely(f->f_mode & FMODE_PATH)) {
f->f_op = &empty_fops;
@@ -730,7 +729,6 @@ cleanup_all:
mnt_drop_write(mnt);
}
}
- file_sb_list_del(f);
f->f_path.dentry = NULL;
f->f_path.mnt = NULL;
cleanup_file:
diff --git a/fs/super.c b/fs/super.c
index d0154e5..36fbe27 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -122,22 +122,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
s = NULL;
goto out;
}
-#ifdef CONFIG_SMP
- s->s_files = alloc_percpu(struct list_head);
- if (!s->s_files) {
- security_sb_free(s);
- kfree(s);
- s = NULL;
- goto out;
- } else {
- int i;

- for_each_possible_cpu(i)
- INIT_LIST_HEAD(per_cpu_ptr(s->s_files, i));
- }
-#else
- INIT_LIST_HEAD(&s->s_files);
-#endif
s->s_bdi = &default_backing_dev_info;
INIT_HLIST_NODE(&s->s_instances);
INIT_HLIST_BL_HEAD(&s->s_anon);
@@ -200,9 +185,6 @@ out:
*/
static inline void destroy_super(struct super_block *s)
{
-#ifdef CONFIG_SMP
- free_percpu(s->s_files);
-#endif
security_sb_free(s);
WARN_ON(!list_empty(&s->s_mounts));
kfree(s->s_subtype);
@@ -744,7 +726,8 @@ int do_remount_sb(struct super_block *sb, int flags, void *data, int force)
make sure there are no rw files opened */
if (remount_ro) {
if (force) {
- mark_files_ro(sb);
+ sb->s_readonly_remount = 1;
+ smp_wmb();
} else {
retval = sb_prepare_remount_readonly(sb);
if (retval)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 210c347..e7bbe99 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -976,12 +976,7 @@ static inline int ra_has_index(struct file_ra_state *ra, pgoff_t index)
#define FILE_MNT_WRITE_RELEASED 2

struct file {
- /*
- * fu_list becomes invalid after file_free is called and queued via
- * fu_rcuhead for RCU freeing
- */
union {
- struct list_head fu_list;
struct rcu_head fu_rcuhead;
} f_u;
struct path f_path;
@@ -994,9 +989,6 @@ struct file {
* Must not be taken from IRQ context.
*/
spinlock_t f_lock;
-#ifdef CONFIG_SMP
- int f_sb_list_cpu;
-#endif
atomic_long_t f_count;
unsigned int f_flags;
fmode_t f_mode;
@@ -1443,11 +1435,6 @@ struct super_block {

struct list_head s_inodes; /* all inodes */
struct hlist_bl_head s_anon; /* anonymous dentries for (nfs) exporting */
-#ifdef CONFIG_SMP
- struct list_head __percpu *s_files;
-#else
- struct list_head s_files;
-#endif
struct list_head s_mounts; /* list of mounts; _not_ for fs use */
/* s_dentry_lru, s_nr_dentry_unused protected by dcache.c lru locks */
struct list_head s_dentry_lru; /* unused dentry lru */
--
1.9.1

2016-03-16 08:19:20

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 089/107] KVM: svm: unconditionally intercept #DB

From: Paolo Bonzini <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit cbdb967af3d54993f5814f1cee0ed311a055377d upstream.

This is needed to avoid the possibility that the guest triggers
an infinite stream of #DB exceptions (CVE-2015-8104).

VMX is not affected: because it does not save DR6 in the VMCS,
it already intercepts #DB unconditionally.

Reported-by: Jan Beulich <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
[bwh: Backported to 3.2, with thanks to Paolo:
- update_db_bp_intercept() was called update_db_intercept()
- The remaining call is in svm_guest_debug() rather than through svm_x86_ops]
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/kvm/svm.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 56dd88a..6201ca0 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1082,6 +1082,7 @@ static void init_vmcb(struct vcpu_svm *svm)
set_exception_intercept(svm, UD_VECTOR);
set_exception_intercept(svm, MC_VECTOR);
set_exception_intercept(svm, AC_VECTOR);
+ set_exception_intercept(svm, DB_VECTOR);

set_intercept(svm, INTERCEPT_INTR);
set_intercept(svm, INTERCEPT_NMI);
@@ -1637,20 +1638,13 @@ static void svm_set_segment(struct kvm_vcpu *vcpu,
mark_dirty(svm->vmcb, VMCB_SEG);
}

-static void update_db_intercept(struct kvm_vcpu *vcpu)
+static void update_bp_intercept(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);

- clr_exception_intercept(svm, DB_VECTOR);
clr_exception_intercept(svm, BP_VECTOR);

- if (svm->nmi_singlestep)
- set_exception_intercept(svm, DB_VECTOR);
-
if (vcpu->guest_debug & KVM_GUESTDBG_ENABLE) {
- if (vcpu->guest_debug &
- (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))
- set_exception_intercept(svm, DB_VECTOR);
if (vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP)
set_exception_intercept(svm, BP_VECTOR);
} else
@@ -1668,7 +1662,7 @@ static void svm_guest_debug(struct kvm_vcpu *vcpu, struct kvm_guest_debug *dbg)

mark_dirty(svm->vmcb, VMCB_DR);

- update_db_intercept(vcpu);
+ update_bp_intercept(vcpu);
}

static void new_asid(struct vcpu_svm *svm, struct svm_cpu_data *sd)
@@ -1742,7 +1736,6 @@ static int db_interception(struct vcpu_svm *svm)
if (!(svm->vcpu.guest_debug & KVM_GUESTDBG_SINGLESTEP))
svm->vmcb->save.rflags &=
~(X86_EFLAGS_TF | X86_EFLAGS_RF);
- update_db_intercept(&svm->vcpu);
}

if (svm->vcpu.guest_debug &
@@ -3661,7 +3654,6 @@ static void enable_nmi_window(struct kvm_vcpu *vcpu)
*/
svm->nmi_singlestep = true;
svm->vmcb->save.rflags |= (X86_EFLAGS_TF | X86_EFLAGS_RF);
- update_db_intercept(vcpu);
}

static int svm_set_tss_addr(struct kvm *kvm, unsigned int addr)
--
1.9.1

2016-03-16 08:19:39

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 088/107] KVM: x86: work around infinite loop in microcode when #AC is delivered

From: Eric Northup <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 54a20552e1eae07aa240fa370a0293e006b5faed upstream.

It was found that a guest can DoS a host by triggering an infinite
stream of "alignment check" (#AC) exceptions. This causes the
microcode to enter an infinite loop where the core never receives
another interrupt. The host kernel panics pretty quickly due to the
effects (CVE-2015-5307).

Signed-off-by: Eric Northup <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
[lizf: Backported to 3.4:
- adjust filename
- adjust context
- add definition of AC_VECTOR]
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm.c | 8 ++++++++
arch/x86/kvm/trace.h | 1 +
arch/x86/kvm/vmx.c | 5 ++++-
4 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d60facb..493b026 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -87,6 +87,7 @@
#define GP_VECTOR 13
#define PF_VECTOR 14
#define MF_VECTOR 16
+#define AC_VECTOR 17
#define MC_VECTOR 18

#define SELECTOR_TI_MASK (1 << 2)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 86c74c0..56dd88a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1081,6 +1081,7 @@ static void init_vmcb(struct vcpu_svm *svm)
set_exception_intercept(svm, PF_VECTOR);
set_exception_intercept(svm, UD_VECTOR);
set_exception_intercept(svm, MC_VECTOR);
+ set_exception_intercept(svm, AC_VECTOR);

set_intercept(svm, INTERCEPT_INTR);
set_intercept(svm, INTERCEPT_NMI);
@@ -1776,6 +1777,12 @@ static int ud_interception(struct vcpu_svm *svm)
return 1;
}

+static int ac_interception(struct vcpu_svm *svm)
+{
+ kvm_queue_exception_e(&svm->vcpu, AC_VECTOR, 0);
+ return 1;
+}
+
static void svm_fpu_activate(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -3291,6 +3298,7 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm) = {
[SVM_EXIT_EXCP_BASE + PF_VECTOR] = pf_interception,
[SVM_EXIT_EXCP_BASE + NM_VECTOR] = nm_interception,
[SVM_EXIT_EXCP_BASE + MC_VECTOR] = mc_interception,
+ [SVM_EXIT_EXCP_BASE + AC_VECTOR] = ac_interception,
[SVM_EXIT_INTR] = intr_interception,
[SVM_EXIT_NMI] = nmi_interception,
[SVM_EXIT_SMI] = nop_on_interception,
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 911d264..d26a7e2 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -244,6 +244,7 @@ TRACE_EVENT(kvm_apic,
{ SVM_EXIT_EXCP_BASE + UD_VECTOR, "UD excp" }, \
{ SVM_EXIT_EXCP_BASE + PF_VECTOR, "PF excp" }, \
{ SVM_EXIT_EXCP_BASE + NM_VECTOR, "NM excp" }, \
+ { SVM_EXIT_EXCP_BASE + AC_VECTOR, "AC excp" }, \
{ SVM_EXIT_EXCP_BASE + MC_VECTOR, "MC excp" }, \
{ SVM_EXIT_INTR, "interrupt" }, \
{ SVM_EXIT_NMI, "nmi" }, \
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4ad0d71..defd510 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1169,7 +1169,7 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
u32 eb;

eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) |
- (1u << NM_VECTOR) | (1u << DB_VECTOR);
+ (1u << NM_VECTOR) | (1u << DB_VECTOR) | (1u << AC_VECTOR);
if ((vcpu->guest_debug &
(KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)) ==
(KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP))
@@ -4260,6 +4260,9 @@ static int handle_exception(struct kvm_vcpu *vcpu)

ex_no = intr_info & INTR_INFO_VECTOR_MASK;
switch (ex_no) {
+ case AC_VECTOR:
+ kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
+ return 1;
case DB_VECTOR:
dr6 = vmcs_readl(EXIT_QUALIFICATION);
if (!(vcpu->guest_debug &
--
1.9.1

2016-03-16 08:20:03

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 085/107] KEYS: Fix race between key destruction and finding a keyring by name

From: David Howells <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 94c4554ba07adbdde396748ee7ae01e86cf2d8d7 upstream.

There appears to be a race between:

(1) key_gc_unused_keys() which frees key->security and then calls
keyring_destroy() to unlink the name from the name list

(2) find_keyring_by_name() which calls key_permission(), thus accessing
key->security, on a key before checking to see whether the key usage is 0
(ie. the key is dead and might be cleaned up).

Fix this by calling ->destroy() before cleaning up the core key data -
including key->security.

Reported-by: Petr Matousek <[email protected]>
Signed-off-by: David Howells <[email protected]>
[lizf: Backported to 3.4: adjust indentation]
Signed-off-by: Zefan Li <[email protected]>
---
security/keys/gc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 87632bd..f85d638 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -174,6 +174,10 @@ static noinline void key_gc_unused_key(struct key *key)
{
key_check(key);

+ /* Throw away the key data */
+ if (key->type->destroy)
+ key->type->destroy(key);
+
security_key_free(key);

/* deal with the user's key tracking and quota */
@@ -188,10 +192,6 @@ static noinline void key_gc_unused_key(struct key *key)
if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags))
atomic_dec(&key->user->nikeys);

- /* now throw away the key memory */
- if (key->type->destroy)
- key->type->destroy(key);
-
key_user_put(key->user);

kfree(key->description);
--
1.9.1

2016-03-16 08:20:27

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 084/107] USB: whiteheat: fix potential null-deref at probe

From: Johan Hovold <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit cbb4be652d374f64661137756b8f357a1827d6a4 upstream.

Fix potential null-pointer dereference at probe by making sure that the
required endpoints are present.

The whiteheat driver assumes there are at least five pairs of bulk
endpoints, of which the final pair is used for the "command port". An
attempt to bind to an interface with fewer bulk endpoints would
currently lead to an oops.

Fixes CVE-2015-5257.

Reported-by: Moein Ghasemzadeh <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/serial/whiteheat.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/drivers/usb/serial/whiteheat.c b/drivers/usb/serial/whiteheat.c
index bf7014d..5cf9db1 100644
--- a/drivers/usb/serial/whiteheat.c
+++ b/drivers/usb/serial/whiteheat.c
@@ -91,6 +91,8 @@ static int whiteheat_firmware_download(struct usb_serial *serial,
static int whiteheat_firmware_attach(struct usb_serial *serial);

/* function prototypes for the Connect Tech WhiteHEAT serial converter */
+static int whiteheat_probe(struct usb_serial *serial,
+ const struct usb_device_id *id);
static int whiteheat_attach(struct usb_serial *serial);
static void whiteheat_release(struct usb_serial *serial);
static int whiteheat_open(struct tty_struct *tty,
@@ -134,6 +136,7 @@ static struct usb_serial_driver whiteheat_device = {
.description = "Connect Tech - WhiteHEAT",
.id_table = id_table_std,
.num_ports = 4,
+ .probe = whiteheat_probe,
.attach = whiteheat_attach,
.release = whiteheat_release,
.open = whiteheat_open,
@@ -336,6 +339,34 @@ static int whiteheat_firmware_attach(struct usb_serial *serial)
/*****************************************************************************
* Connect Tech's White Heat serial driver functions
*****************************************************************************/
+
+static int whiteheat_probe(struct usb_serial *serial,
+ const struct usb_device_id *id)
+{
+ struct usb_host_interface *iface_desc;
+ struct usb_endpoint_descriptor *endpoint;
+ size_t num_bulk_in = 0;
+ size_t num_bulk_out = 0;
+ size_t min_num_bulk;
+ unsigned int i;
+
+ iface_desc = serial->interface->cur_altsetting;
+
+ for (i = 0; i < iface_desc->desc.bNumEndpoints; i++) {
+ endpoint = &iface_desc->endpoint[i].desc;
+ if (usb_endpoint_is_bulk_in(endpoint))
+ ++num_bulk_in;
+ if (usb_endpoint_is_bulk_out(endpoint))
+ ++num_bulk_out;
+ }
+
+ min_num_bulk = COMMAND_PORT + 1;
+ if (num_bulk_in < min_num_bulk || num_bulk_out < min_num_bulk)
+ return -ENODEV;
+
+ return 0;
+}
+
static int whiteheat_attach(struct usb_serial *serial)
{
struct usb_serial_port *command_port;
--
1.9.1

2016-03-16 08:20:45

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 081/107] virtio-net: drop NETIF_F_FRAGLIST

From: Jason Wang <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 48900cb6af4282fa0fb6ff4d72a81aa3dadb5c39 upstream.

virtio declares support for NETIF_F_FRAGLIST, but assumes
that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
always true with a fraglist.

A longer fraglist in the skb will make the call to skb_to_sgvec overflow
the sg array, leading to memory corruption.

Drop NETIF_F_FRAGLIST so we only get what we can handle.

Cc: Michael S. Tsirkin <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/net/virtio_net.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index efb50d6..27c82ee 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1036,9 +1036,9 @@ static int virtnet_probe(struct virtio_device *vdev)
/* Do we support "hardware" checksums? */
if (virtio_has_feature(vdev, VIRTIO_NET_F_CSUM)) {
/* This opens up the world of extra features. */
- dev->hw_features |= NETIF_F_HW_CSUM|NETIF_F_SG|NETIF_F_FRAGLIST;
+ dev->hw_features |= NETIF_F_HW_CSUM | NETIF_F_SG;
if (csum)
- dev->features |= NETIF_F_HW_CSUM|NETIF_F_SG|NETIF_F_FRAGLIST;
+ dev->features |= NETIF_F_HW_CSUM | NETIF_F_SG;

if (virtio_has_feature(vdev, VIRTIO_NET_F_GSO)) {
dev->hw_features |= NETIF_F_TSO | NETIF_F_UFO
--
1.9.1

2016-03-16 08:21:35

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 080/107] sg_start_req(): make sure that there's not too many elements in iovec

From: Al Viro <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 451a2886b6bf90e2fb378f7c46c655450fb96e81 upstream.

unfortunately, allowing an arbitrary 16bit value means a possibility of
overflow in the calculation of total number of pages in bio_map_user_iov() -
we rely on there being no more than PAGE_SIZE members of sum in the
first loop there. If that sum wraps around, we end up allocating
too small array of pointers to pages and it's easy to overflow it in
the second loop.

X-Coverup: TINC (and there's no lumber cartel either)
Signed-off-by: Al Viro <[email protected]>
[lizf: Backported to 3.4: s/MAX_UIOVEC/UIO_MAXIOV]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/scsi/sg.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index fb119ce..ebe83f7 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1687,6 +1687,9 @@ static int sg_start_req(Sg_request *srp, unsigned char *cmd)
md->from_user = 0;
}

+ if (unlikely(iov_count > UIO_MAXIOV))
+ return -EINVAL;
+
if (iov_count) {
int len, size = sizeof(struct sg_iovec) * iov_count;
struct iovec *iov;
--
1.9.1

2016-03-16 08:22:04

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 079/107] RDS: fix race condition when sending a message on unbound socket

From: Quentin Casasnovas <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 8c7188b23474cca017b3ef354c4a58456f68303a upstream.

Sasha's found a NULL pointer dereference in the RDS connection code when
sending a message to an apparently unbound socket. The problem is caused
by the code checking if the socket is bound in rds_sendmsg(), which checks
the rs_bound_addr field without taking a lock on the socket. This opens a
race where rs_bound_addr is temporarily set but where the transport is not
in rds_bind(), leading to a NULL pointer dereference when trying to
dereference 'trans' in __rds_conn_create().

Vegard wrote a reproducer for this issue, so kindly ask him to share if
you're interested.

I cannot reproduce the NULL pointer dereference using Vegard's reproducer
with this patch, whereas I could without.

Complete earlier incomplete fix to CVE-2015-6937:

74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection")

Cc: David S. Miller <[email protected]>

Reviewed-by: Vegard Nossum <[email protected]>
Reviewed-by: Sasha Levin <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Quentin Casasnovas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/rds/connection.c | 6 ------
net/rds/send.c | 4 +++-
2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index be3eecd..9e07c75 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -178,12 +178,6 @@ static struct rds_connection *__rds_conn_create(__be32 laddr, __be32 faddr,
}
}

- if (trans == NULL) {
- kmem_cache_free(rds_conn_slab, conn);
- conn = ERR_PTR(-ENODEV);
- goto out;
- }
-
conn->c_trans = trans;

ret = trans->conn_alloc(conn, gfp);
diff --git a/net/rds/send.c b/net/rds/send.c
index 88eace5..31c9fa4 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -955,11 +955,13 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
release_sock(sk);
}

- /* racing with another thread binding seems ok here */
+ lock_sock(sk);
if (daddr == 0 || rs->rs_bound_addr == 0) {
+ release_sock(sk);
ret = -ENOTCONN; /* XXX not a great errno */
goto out;
}
+ release_sock(sk);

/* size of rm including all sgs */
ret = rds_rm_size(msg, payload_len);
--
1.9.1

2016-03-16 08:11:54

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 074/107] Revert "usb: dwc3: Reset the transfer resource index on SET_INTERFACE"

From: Zefan Li <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


It was applied to the wrong function.

This reverts commit 15488de7b72b6ab8254dda07053faa4be6b9ec66.
---
drivers/usb/dwc3/ep0.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index 1d55451..7c0eaeb 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -648,10 +648,6 @@ static void dwc3_ep0_xfer_complete(struct dwc3 *dwc,
dev_vdbg(dwc->dev, "Status Phase\n");
dwc3_ep0_complete_req(dwc, event);
break;
- case USB_REQ_SET_INTERFACE:
- dev_vdbg(dwc->dev, "USB_REQ_SET_INTERFACE\n");
- dwc->start_config_issued = false;
- /* Fall through */
default:
WARN(true, "UNKNOWN ep0state %d\n", dwc->ep0state);
}
--
1.9.1

2016-03-16 08:22:51

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 077/107] net: add validation for the socket syscall protocol argument

From: Hannes Frederic Sowa <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 79462ad02e861803b3840cc782248c7359451cd9 upstream.

郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

int socket_fd;
struct sockaddr_in addr;
addr.sin_port = 0;
addr.sin_addr.s_addr = INADDR_ANY;
addr.sin_family = 10;

socket_fd = socket(10,3,0x40000000);
connect(socket_fd , &addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110
kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10
kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang <[email protected]>
Reported-by: 郭永刚 <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: open-code U8_MAX]
Signed-off-by: Zefan Li <[email protected]>
---
include/net/sock.h | 1 +
net/ax25/af_ax25.c | 3 +++
net/decnet/af_decnet.c | 3 +++
net/ipv4/af_inet.c | 3 +++
net/ipv6/af_inet6.c | 3 +++
net/irda/af_irda.c | 3 +++
6 files changed, 16 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index f673ba5..e2073e0 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -328,6 +328,7 @@ struct sock {
sk_no_check : 2,
sk_userlocks : 4,
sk_protocol : 8,
+#define SK_PROTOCOL_MAX ((u8)~0U)
sk_type : 16;
kmemcheck_bitfield_end(flags);
int sk_wmem_queued;
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index ca1820c..f59c8af 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -811,6 +811,9 @@ static int ax25_create(struct net *net, struct socket *sock, int protocol,
struct sock *sk;
ax25_cb *ax25;

+ if (protocol < 0 || protocol > SK_PROTOCOL_MAX)
+ return -EINVAL;
+
if (!net_eq(net, &init_net))
return -EAFNOSUPPORT;

diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index 4136987..4fa941e 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -680,6 +680,9 @@ static int dn_create(struct net *net, struct socket *sock, int protocol,
{
struct sock *sk;

+ if (protocol < 0 || protocol > SK_PROTOCOL_MAX)
+ return -EINVAL;
+
if (!net_eq(net, &init_net))
return -EAFNOSUPPORT;

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 78ec298..0a828e2 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -284,6 +284,9 @@ static int inet_create(struct net *net, struct socket *sock, int protocol,
if (sock->type != SOCK_RAW && sock->type != SOCK_DGRAM)
build_ehash_secret();

+ if (protocol < 0 || protocol >= IPPROTO_MAX)
+ return -EINVAL;
+
sock->state = SS_UNCONNECTED;

/* Look for the requested type/protocol pair. */
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 8ed1b93..5300ef3 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -113,6 +113,9 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
!inet_ehash_secret)
build_ehash_secret();

+ if (protocol < 0 || protocol >= IPPROTO_MAX)
+ return -EINVAL;
+
/* Look for the requested type/protocol pair. */
lookup_protocol:
err = -ESOCKTNOSUPPORT;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 12218f7..3eaf4fe 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1106,6 +1106,9 @@ static int irda_create(struct net *net, struct socket *sock, int protocol,

IRDA_DEBUG(2, "%s()\n", __func__);

+ if (protocol < 0 || protocol > SK_PROTOCOL_MAX)
+ return -EINVAL;
+
if (net != &init_net)
return -EAFNOSUPPORT;

--
1.9.1

2016-03-16 08:23:41

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 076/107] pptp: verify sockaddr_len in pptp_bind() and pptp_connect()

From: WANG Cong <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 09ccfd238e5a0e670d8178cf50180ea81ae09ae1 upstream.

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/net/ppp/pptp.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/ppp/pptp.c b/drivers/net/ppp/pptp.c
index 9f047a0..5a13ad0 100644
--- a/drivers/net/ppp/pptp.c
+++ b/drivers/net/ppp/pptp.c
@@ -420,6 +420,9 @@ static int pptp_bind(struct socket *sock, struct sockaddr *uservaddr,
struct pptp_opt *opt = &po->proto.pptp;
int error = 0;

+ if (sockaddr_len < sizeof(struct sockaddr_pppox))
+ return -EINVAL;
+
lock_sock(sk);

opt->src_addr = sp->sa_addr.pptp;
@@ -441,6 +444,9 @@ static int pptp_connect(struct socket *sock, struct sockaddr *uservaddr,
struct flowi4 fl4;
int error = 0;

+ if (sockaddr_len < sizeof(struct sockaddr_pppox))
+ return -EINVAL;
+
if (sp->sa_protocol != PX_PROTO_PPTP)
return -EINVAL;

--
1.9.1

2016-03-16 08:23:38

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 078/107] RDS: verify the underlying transport exists before creating a connection

From: Sasha Levin <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 74e98eb085889b0d2d4908f59f6e00026063014f upstream.

There was no verification that an underlying transport exists when creating
a connection, this would cause dereferencing a NULL ptr.

It might happen on sockets that weren't properly bound before attempting to
send a message, which will cause a NULL ptr deref:

[135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[135546.051270] Modules linked in:
[135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
[135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
[135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
[135546.055666] RSP: 0018:ffff8800bc70fab0 EFLAGS: 00010202
[135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
[135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
[135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
[135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
[135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
[135546.061668] FS: 00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
[135546.062836] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
[135546.064723] Stack:
[135546.065048] ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
[135546.066247] 0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
[135546.067438] 1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
[135546.068629] Call Trace:
[135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
[135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
[135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
[135546.071981] rds_sendmsg (net/rds/send.c:1058)
[135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
[135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
[135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
[135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
[135546.076349] ? __might_fault (mm/memory.c:3795)
[135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
[135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
[135546.078856] SYSC_sendto (net/socket.c:1657)
[135546.079596] ? SYSC_connect (net/socket.c:1628)
[135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
[135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
[135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
[135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
[135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1

Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/rds/connection.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index 9e07c75..be3eecd 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -178,6 +178,12 @@ static struct rds_connection *__rds_conn_create(__be32 laddr, __be32 faddr,
}
}

+ if (trans == NULL) {
+ kmem_cache_free(rds_conn_slab, conn);
+ conn = ERR_PTR(-ENODEV);
+ goto out;
+ }
+
conn->c_trans = trans;

ret = trans->conn_alloc(conn, gfp);
--
1.9.1

2016-03-16 08:11:31

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 065/107] x86/ldt: Correct FPU emulation access to LDT

From: Juergen Gross <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4809146b86c3d41ce588fdb767d021e2a80600dd upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

Adapt the x86 fpu emulation code to use that new structure.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/math-emu/fpu_entry.c | 3 +--
arch/x86/math-emu/fpu_system.h | 21 ++++++++++++++++++---
arch/x86/math-emu/get_address.c | 3 +--
3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index 9b86812..274a52b 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -29,7 +29,6 @@

#include <asm/uaccess.h>
#include <asm/traps.h>
-#include <asm/desc.h>
#include <asm/user.h>
#include <asm/i387.h>

@@ -185,7 +184,7 @@ void math_emulate(struct math_emu_info *info)
math_abort(FPU_info, SIGILL);
}

- code_descriptor = LDT_DESCRIPTOR(FPU_CS);
+ code_descriptor = FPU_get_ldt_descriptor(FPU_CS);
if (SEG_D_SIZE(code_descriptor)) {
/* The above test may be wrong, the book is not clear */
/* Segmented 32 bit protected mode */
diff --git a/arch/x86/math-emu/fpu_system.h b/arch/x86/math-emu/fpu_system.h
index 2c61441..d342fce 100644
--- a/arch/x86/math-emu/fpu_system.h
+++ b/arch/x86/math-emu/fpu_system.h
@@ -16,9 +16,24 @@
#include <linux/kernel.h>
#include <linux/mm.h>

-/* s is always from a cpu register, and the cpu does bounds checking
- * during register load --> no further bounds checks needed */
-#define LDT_DESCRIPTOR(s) (((struct desc_struct *)current->mm->context.ldt)[(s) >> 3])
+#include <asm/desc.h>
+#include <asm/mmu_context.h>
+
+static inline struct desc_struct FPU_get_ldt_descriptor(unsigned seg)
+{
+ static struct desc_struct zero_desc;
+ struct desc_struct ret = zero_desc;
+
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
+ seg >>= 3;
+ mutex_lock(&current->mm->context.lock);
+ if (current->mm->context.ldt && seg < current->mm->context.ldt->size)
+ ret = current->mm->context.ldt->entries[seg];
+ mutex_unlock(&current->mm->context.lock);
+#endif
+ return ret;
+}
+
#define SEG_D_SIZE(x) ((x).b & (3 << 21))
#define SEG_G_BIT(x) ((x).b & (1 << 23))
#define SEG_GRANULARITY(x) (((x).b & (1 << 23)) ? 4096 : 1)
diff --git a/arch/x86/math-emu/get_address.c b/arch/x86/math-emu/get_address.c
index 6ef5e99..d13cab2 100644
--- a/arch/x86/math-emu/get_address.c
+++ b/arch/x86/math-emu/get_address.c
@@ -20,7 +20,6 @@
#include <linux/stddef.h>

#include <asm/uaccess.h>
-#include <asm/desc.h>

#include "fpu_system.h"
#include "exception.h"
@@ -158,7 +157,7 @@ static long pm_address(u_char FPU_modrm, u_char segment,
addr->selector = PM_REG_(segment);
}

- descriptor = LDT_DESCRIPTOR(PM_REG_(segment));
+ descriptor = FPU_get_ldt_descriptor(segment);
base_address = SEG_BASE_ADDR(descriptor);
address = base_address + offset;
limit = base_address
--
1.9.1

2016-03-16 08:24:33

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 072/107] net: Fix RCU splat in af_key

From: David Ahern <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit ba51b6be38c122f7dab40965b4397aaf6188a464 upstream.

Hit the following splat testing VRF change for ipsec:

[ 113.475692] ===============================
[ 113.476194] [ INFO: suspicious RCU usage. ]
[ 113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
[ 113.477545] -------------------------------
[ 113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
[ 113.479288]
[ 113.479288] other info that might help us debug this:
[ 113.479288]
[ 113.480207]
[ 113.480207] rcu_scheduler_active = 1, debug_locks = 1
[ 113.480931] 2 locks held by setkey/6829:
[ 113.481371] #0: (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
[ 113.482509] #1: (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
[ 113.483509]
[ 113.483509] stack backtrace:
[ 113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
[ 113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[ 113.486845] 0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
[ 113.487732] ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
[ 113.488628] 0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
[ 113.489525] Call Trace:
[ 113.489813] [<ffffffff81518af2>] dump_stack+0x4c/0x65
[ 113.490389] [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
[ 113.491039] [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
[ 113.491735] [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
[ 113.492442] [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
[ 113.493077] [<ffffffff81064268>] __might_sleep+0x6c/0x82
[ 113.493681] [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
[ 113.494508] [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
[ 113.495149] [<ffffffff814012b5>] skb_clone+0x64/0x80
[ 113.495712] [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
[ 113.496380] [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
[ 113.497024] [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
[ 113.497653] [<ffffffff814e9770>] pfkey_process+0x162/0x17e
[ 113.498274] [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213

In pfkey_sendmsg the net mutex is taken and then pfkey_broadcast takes
the RCU lock.

Since pfkey_broadcast takes the RCU lock the allocation argument is
pointless since GFP_ATOMIC must be used between the rcu_read_{,un}lock.
The one call outside of rcu can be done with GFP_KERNEL.

Fixes: 7f6b9dbd5afbd ("af_key: locking change")
Signed-off-by: David Ahern <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/key/af_key.c | 46 +++++++++++++++++++++++-----------------------
1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index d5cd439..eb6ce3b 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -220,7 +220,7 @@ static int pfkey_broadcast_one(struct sk_buff *skb, struct sk_buff **skb2,
#define BROADCAST_ONE 1
#define BROADCAST_REGISTERED 2
#define BROADCAST_PROMISC_ONLY 4
-static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
+static int pfkey_broadcast(struct sk_buff *skb,
int broadcast_flags, struct sock *one_sk,
struct net *net)
{
@@ -246,7 +246,7 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
* socket.
*/
if (pfk->promisc)
- pfkey_broadcast_one(skb, &skb2, allocation, sk);
+ pfkey_broadcast_one(skb, &skb2, GFP_ATOMIC, sk);

/* the exact target will be processed later */
if (sk == one_sk)
@@ -261,7 +261,7 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
continue;
}

- err2 = pfkey_broadcast_one(skb, &skb2, allocation, sk);
+ err2 = pfkey_broadcast_one(skb, &skb2, GFP_ATOMIC, sk);

/* Error is cleare after succecful sending to at least one
* registered KM */
@@ -271,7 +271,7 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
rcu_read_unlock();

if (one_sk != NULL)
- err = pfkey_broadcast_one(skb, &skb2, allocation, one_sk);
+ err = pfkey_broadcast_one(skb, &skb2, GFP_KERNEL, one_sk);

kfree_skb(skb2);
kfree_skb(skb);
@@ -294,7 +294,7 @@ static int pfkey_do_dump(struct pfkey_sock *pfk)
hdr = (struct sadb_msg *) pfk->dump.skb->data;
hdr->sadb_msg_seq = 0;
hdr->sadb_msg_errno = rc;
- pfkey_broadcast(pfk->dump.skb, GFP_ATOMIC, BROADCAST_ONE,
+ pfkey_broadcast(pfk->dump.skb, BROADCAST_ONE,
&pfk->sk, sock_net(&pfk->sk));
pfk->dump.skb = NULL;
}
@@ -335,7 +335,7 @@ static int pfkey_error(const struct sadb_msg *orig, int err, struct sock *sk)
hdr->sadb_msg_len = (sizeof(struct sadb_msg) /
sizeof(uint64_t));

- pfkey_broadcast(skb, GFP_KERNEL, BROADCAST_ONE, sk, sock_net(sk));
+ pfkey_broadcast(skb, BROADCAST_ONE, sk, sock_net(sk));

return 0;
}
@@ -1361,7 +1361,7 @@ static int pfkey_getspi(struct sock *sk, struct sk_buff *skb, const struct sadb_

xfrm_state_put(x);

- pfkey_broadcast(resp_skb, GFP_KERNEL, BROADCAST_ONE, sk, net);
+ pfkey_broadcast(resp_skb, BROADCAST_ONE, sk, net);

return 0;
}
@@ -1449,7 +1449,7 @@ static int key_notify_sa(struct xfrm_state *x, const struct km_event *c)
hdr->sadb_msg_seq = c->seq;
hdr->sadb_msg_pid = c->pid;

- pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_ALL, NULL, xs_net(x));
+ pfkey_broadcast(skb, BROADCAST_ALL, NULL, xs_net(x));

return 0;
}
@@ -1566,7 +1566,7 @@ static int pfkey_get(struct sock *sk, struct sk_buff *skb, const struct sadb_msg
out_hdr->sadb_msg_reserved = 0;
out_hdr->sadb_msg_seq = hdr->sadb_msg_seq;
out_hdr->sadb_msg_pid = hdr->sadb_msg_pid;
- pfkey_broadcast(out_skb, GFP_ATOMIC, BROADCAST_ONE, sk, sock_net(sk));
+ pfkey_broadcast(out_skb, BROADCAST_ONE, sk, sock_net(sk));

return 0;
}
@@ -1667,7 +1667,7 @@ static int pfkey_register(struct sock *sk, struct sk_buff *skb, const struct sad
return -ENOBUFS;
}

- pfkey_broadcast(supp_skb, GFP_KERNEL, BROADCAST_REGISTERED, sk, sock_net(sk));
+ pfkey_broadcast(supp_skb, BROADCAST_REGISTERED, sk, sock_net(sk));

return 0;
}
@@ -1686,7 +1686,7 @@ static int unicast_flush_resp(struct sock *sk, const struct sadb_msg *ihdr)
hdr->sadb_msg_errno = (uint8_t) 0;
hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));

- return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_ONE, sk, sock_net(sk));
+ return pfkey_broadcast(skb, BROADCAST_ONE, sk, sock_net(sk));
}

static int key_notify_sa_flush(const struct km_event *c)
@@ -1707,7 +1707,7 @@ static int key_notify_sa_flush(const struct km_event *c)
hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
hdr->sadb_msg_reserved = 0;

- pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);
+ pfkey_broadcast(skb, BROADCAST_ALL, NULL, c->net);

return 0;
}
@@ -1768,7 +1768,7 @@ static int dump_sa(struct xfrm_state *x, int count, void *ptr)
out_hdr->sadb_msg_pid = pfk->dump.msg_pid;

if (pfk->dump.skb)
- pfkey_broadcast(pfk->dump.skb, GFP_ATOMIC, BROADCAST_ONE,
+ pfkey_broadcast(pfk->dump.skb, BROADCAST_ONE,
&pfk->sk, sock_net(&pfk->sk));
pfk->dump.skb = out_skb;

@@ -1829,7 +1829,7 @@ static int pfkey_promisc(struct sock *sk, struct sk_buff *skb, const struct sadb
new_hdr->sadb_msg_errno = 0;
}

- pfkey_broadcast(skb, GFP_KERNEL, BROADCAST_ALL, NULL, sock_net(sk));
+ pfkey_broadcast(skb, BROADCAST_ALL, NULL, sock_net(sk));
return 0;
}

@@ -2160,7 +2160,7 @@ static int key_notify_policy(struct xfrm_policy *xp, int dir, const struct km_ev
out_hdr->sadb_msg_errno = 0;
out_hdr->sadb_msg_seq = c->seq;
out_hdr->sadb_msg_pid = c->pid;
- pfkey_broadcast(out_skb, GFP_ATOMIC, BROADCAST_ALL, NULL, xp_net(xp));
+ pfkey_broadcast(out_skb, BROADCAST_ALL, NULL, xp_net(xp));
return 0;

}
@@ -2386,7 +2386,7 @@ static int key_pol_get_resp(struct sock *sk, struct xfrm_policy *xp, const struc
out_hdr->sadb_msg_errno = 0;
out_hdr->sadb_msg_seq = hdr->sadb_msg_seq;
out_hdr->sadb_msg_pid = hdr->sadb_msg_pid;
- pfkey_broadcast(out_skb, GFP_ATOMIC, BROADCAST_ONE, sk, xp_net(xp));
+ pfkey_broadcast(out_skb, BROADCAST_ONE, sk, xp_net(xp));
err = 0;

out:
@@ -2639,7 +2639,7 @@ static int dump_sp(struct xfrm_policy *xp, int dir, int count, void *ptr)
out_hdr->sadb_msg_pid = pfk->dump.msg_pid;

if (pfk->dump.skb)
- pfkey_broadcast(pfk->dump.skb, GFP_ATOMIC, BROADCAST_ONE,
+ pfkey_broadcast(pfk->dump.skb, BROADCAST_ONE,
&pfk->sk, sock_net(&pfk->sk));
pfk->dump.skb = out_skb;

@@ -2690,7 +2690,7 @@ static int key_notify_policy_flush(const struct km_event *c)
hdr->sadb_msg_satype = SADB_SATYPE_UNSPEC;
hdr->sadb_msg_len = (sizeof(struct sadb_msg) / sizeof(uint64_t));
hdr->sadb_msg_reserved = 0;
- pfkey_broadcast(skb_out, GFP_ATOMIC, BROADCAST_ALL, NULL, c->net);
+ pfkey_broadcast(skb_out, BROADCAST_ALL, NULL, c->net);
return 0;

}
@@ -2756,7 +2756,7 @@ static int pfkey_process(struct sock *sk, struct sk_buff *skb, const struct sadb
void *ext_hdrs[SADB_EXT_MAX];
int err;

- pfkey_broadcast(skb_clone(skb, GFP_KERNEL), GFP_KERNEL,
+ pfkey_broadcast(skb_clone(skb, GFP_KERNEL),
BROADCAST_PROMISC_ONLY, NULL, sock_net(sk));

memset(ext_hdrs, 0, sizeof(ext_hdrs));
@@ -2962,7 +2962,7 @@ static int key_notify_sa_expire(struct xfrm_state *x, const struct km_event *c)
out_hdr->sadb_msg_seq = 0;
out_hdr->sadb_msg_pid = 0;

- pfkey_broadcast(out_skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL, xs_net(x));
+ pfkey_broadcast(out_skb, BROADCAST_REGISTERED, NULL, xs_net(x));
return 0;
}

@@ -3134,7 +3134,7 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct
xfrm_ctx->ctx_len);
}

- return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL, xs_net(x));
+ return pfkey_broadcast(skb, BROADCAST_REGISTERED, NULL, xs_net(x));
}

static struct xfrm_policy *pfkey_compile_policy(struct sock *sk, int opt,
@@ -3332,7 +3332,7 @@ static int pfkey_send_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
n_port->sadb_x_nat_t_port_port = sport;
n_port->sadb_x_nat_t_port_reserved = 0;

- return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL, xs_net(x));
+ return pfkey_broadcast(skb, BROADCAST_REGISTERED, NULL, xs_net(x));
}

#ifdef CONFIG_NET_KEY_MIGRATE
@@ -3524,7 +3524,7 @@ static int pfkey_send_migrate(const struct xfrm_selector *sel, u8 dir, u8 type,
}

/* broadcast migrate message to sockets */
- pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_ALL, NULL, &init_net);
+ pfkey_broadcast(skb, BROADCAST_ALL, NULL, &init_net);

return 0;

--
1.9.1

2016-03-16 08:24:57

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 071/107] ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits

From: "Herton R. Krzesinski" <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 602b8593d2b4138c10e922eeaafe306f6b51817b upstream.

The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).

For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:

Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b kkkkkkkk.kkkkkkk
010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff ....kkkk........
Prev obj: start=ffff88003b45c180, len=64
000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a .....N......ZZZZ
010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff ...........7....
Next obj: start=ffff88003b45c200, len=64
000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a .....N......ZZZZ
010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff ........h).<....
BUG: spinlock wrong CPU on CPU#2, test/18028
general protection fault: 0000 [#1] SMP
Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
RIP: spin_dump+0x53/0xc0
Call Trace:
spin_bug+0x30/0x40
do_raw_spin_unlock+0x71/0xa0
_raw_spin_unlock+0xe/0x10
freeary+0x82/0x2a0
? _raw_spin_lock+0xe/0x10
semctl_down.clone.0+0xce/0x160
? __do_page_fault+0x19a/0x430
? __audit_syscall_entry+0xa8/0x100
SyS_semctl+0x236/0x2c0
? syscall_trace_leave+0xde/0x130
entry_SYSCALL_64_fastpath+0x12/0x71
Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
RIP [<ffffffff810d6053>] spin_dump+0x53/0xc0
RSP <ffff88003750fd68>
---[ end trace 783ebb76612867a0 ]---
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
CPU: 3 PID: 18053 Comm: test Tainted: G D 4.2.0-rc5+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
RIP: native_read_tsc+0x0/0x20
Call Trace:
? delay_tsc+0x40/0x70
__delay+0xf/0x20
do_raw_spin_lock+0x96/0x140
_raw_spin_lock+0xe/0x10
sem_lock_and_putref+0x11/0x70
SYSC_semtimedop+0x7bf/0x960
? handle_mm_fault+0xbf6/0x1880
? dequeue_task_fair+0x79/0x4a0
? __do_page_fault+0x19a/0x430
? kfree_debugcheck+0x16/0x40
? __do_page_fault+0x19a/0x430
? __audit_syscall_entry+0xa8/0x100
? do_audit_syscall_entry+0x66/0x70
? syscall_trace_enter_phase1+0x139/0x160
SyS_semtimedop+0xe/0x10
SyS_semop+0x10/0x20
entry_SYSCALL_64_fastpath+0x12/0x71
Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).

The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case
[1].

[1] Test case used below:

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <errno.h>

#define NSEM 1
#define NSET 5

int sid[NSET];

void thread()
{
struct sembuf op;
int s;
uid_t pid = getuid();

s = rand() % NSET;
op.sem_num = pid % NSEM;
op.sem_op = 1;
op.sem_flg = SEM_UNDO;

semop(sid[s], &op, 1);
exit(EXIT_SUCCESS);
}

void create_set()
{
int i, j;
pid_t p;
union {
int val;
struct semid_ds *buf;
unsigned short int *array;
struct seminfo *__buf;
} un;

/* Create and initialize semaphore set */
for (i = 0; i < NSET; i++) {
sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
if (sid[i] < 0) {
perror("semget");
exit(EXIT_FAILURE);
}
}
un.val = 0;
for (i = 0; i < NSET; i++) {
for (j = 0; j < NSEM; j++) {
if (semctl(sid[i], j, SETVAL, un) < 0)
perror("semctl");
}
}

/* Launch threads that operate on semaphore set */
for (i = 0; i < NSEM * NSET * NSET; i++) {
p = fork();
if (p < 0)
perror("fork");
if (p == 0)
thread();
}

/* Free semaphore set */
for (i = 0; i < NSET; i++) {
if (semctl(sid[i], NSEM, IPC_RMID))
perror("IPC_RMID");
}

/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}

int main(int argc, char **argv)
{
pid_t p;

srand(time(NULL));

while (1) {
p = fork();
if (p < 0) {
perror("fork");
exit(EXIT_FAILURE);
}
if (p == 0) {
create_set();
goto end;
}

/* Wait for forked processes to exit */
while (wait(NULL)) {
if (errno == ECHILD)
break;
};
}
end:
return 0;
}

[[email protected]: use normal comment layout]
Signed-off-by: Herton R. Krzesinski <[email protected]>
Acked-by: Manfred Spraul <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Rafael Aquini <[email protected]>
CC: Aristeu Rozanski <[email protected]>
Cc: David Jeffery <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

Signed-off-by: Linus Torvalds <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
ipc/sem.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 5215a81..67f2110 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1606,16 +1606,27 @@ void exit_sem(struct task_struct *tsk)
rcu_read_lock();
un = list_entry_rcu(ulp->list_proc.next,
struct sem_undo, list_proc);
- if (&un->list_proc == &ulp->list_proc)
- semid = -1;
- else
- semid = un->semid;
+ if (&un->list_proc == &ulp->list_proc) {
+ /*
+ * We must wait for freeary() before freeing this ulp,
+ * in case we raced with last sem_undo. There is a small
+ * possibility where we exit while freeary() didn't
+ * finish unlocking sem_undo_list.
+ */
+ spin_unlock_wait(&ulp->lock);
+ rcu_read_unlock();
+ break;
+ }
+ spin_lock(&ulp->lock);
+ semid = un->semid;
+ spin_unlock(&ulp->lock);
rcu_read_unlock();

+ /* exit_sem raced with IPC_RMID, nothing to do */
if (semid == -1)
- break;
+ continue;

- sma = sem_lock_check(tsk->nsproxy->ipc_ns, un->semid);
+ sma = sem_lock_check(tsk->nsproxy->ipc_ns, semid);

/* exit_sem raced with IPC_RMID, nothing to do */
if (IS_ERR(sma))
--
1.9.1

2016-03-16 08:25:21

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 070/107] EDAC, ppc4xx: Access mci->csrows array elements properly

From: Michael Walle <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 5c16179b550b9fd8114637a56b153c9768ea06a5 upstream.

The commit

de3910eb79ac ("edac: change the mem allocation scheme to
make Documentation/kobject.txt happy")

changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.

Signed-off-by: Michael Walle <[email protected]>
Cc: linux-edac <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/edac/ppc4xx_edac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index d427c69..9212f6c 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -919,7 +919,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
*/

for (row = 0; row < mci->nr_csrows; row++) {
- struct csrow_info *csi = &mci->csrows[row];
+ struct csrow_info *csi = mci->csrows[row];

/*
* Get the configuration settings for this
--
1.9.1

2016-03-16 08:25:54

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 069/107] libfc: Fix fc_fcp_cleanup_each_cmd()

From: Bart Van Assche <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 8f2777f53e3d5ad8ef2a176a4463a5c8e1a16431 upstream.

Since fc_fcp_cleanup_cmd() can sleep this function must not
be called while holding a spinlock. This patch avoids that
fc_fcp_cleanup_each_cmd() triggers the following bug:

BUG: scheduling while atomic: sg_reset/1512/0x00000202
1 lock held by sg_reset/1512:
#0: (&(&fsp->scsi_pkt_lock)->rlock){+.-...}, at: [<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Preemption disabled at:[<ffffffffc0225cd5>] fc_fcp_cleanup_each_cmd.isra.21+0xa5/0x150 [libfc]
Call Trace:
[<ffffffff816c612c>] dump_stack+0x4f/0x7b
[<ffffffff810828bc>] __schedule_bug+0x6c/0xd0
[<ffffffff816c87aa>] __schedule+0x71a/0xa10
[<ffffffff816c8ad2>] schedule+0x32/0x80
[<ffffffffc0217eac>] fc_seq_set_resp+0xac/0x100 [libfc]
[<ffffffffc0218b11>] fc_exch_done+0x41/0x60 [libfc]
[<ffffffffc0225cff>] fc_fcp_cleanup_each_cmd.isra.21+0xcf/0x150 [libfc]
[<ffffffffc0225f43>] fc_eh_device_reset+0x1c3/0x270 [libfc]
[<ffffffff814a2cc9>] scsi_try_bus_device_reset+0x29/0x60
[<ffffffff814a3908>] scsi_ioctl_reset+0x258/0x2d0
[<ffffffff814a2650>] scsi_ioctl+0x150/0x440
[<ffffffff814b3a9d>] sd_ioctl+0xad/0x120
[<ffffffff8132f266>] blkdev_ioctl+0x1b6/0x810
[<ffffffff811da608>] block_ioctl+0x38/0x40
[<ffffffff811b4e08>] do_vfs_ioctl+0x2f8/0x530
[<ffffffff811b50c1>] SyS_ioctl+0x81/0xa0
[<ffffffff816cf8b2>] system_call_fastpath+0x16/0x7a

Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Vasu Dev <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/scsi/libfc/fc_fcp.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/libfc/fc_fcp.c b/drivers/scsi/libfc/fc_fcp.c
index f735730..c979fd0 100644
--- a/drivers/scsi/libfc/fc_fcp.c
+++ b/drivers/scsi/libfc/fc_fcp.c
@@ -1030,11 +1030,26 @@ restart:
fc_fcp_pkt_hold(fsp);
spin_unlock_irqrestore(&si->scsi_queue_lock, flags);

- if (!fc_fcp_lock_pkt(fsp)) {
+ spin_lock_bh(&fsp->scsi_pkt_lock);
+ if (!(fsp->state & FC_SRB_COMPL)) {
+ fsp->state |= FC_SRB_COMPL;
+ /*
+ * TODO: dropping scsi_pkt_lock and then reacquiring
+ * again around fc_fcp_cleanup_cmd() is required,
+ * since fc_fcp_cleanup_cmd() calls into
+ * fc_seq_set_resp() and that func preempts cpu using
+ * schedule. May be schedule and related code should be
+ * removed instead of unlocking here to avoid scheduling
+ * while atomic bug.
+ */
+ spin_unlock_bh(&fsp->scsi_pkt_lock);
+
fc_fcp_cleanup_cmd(fsp, error);
+
+ spin_lock_bh(&fsp->scsi_pkt_lock);
fc_io_compl(fsp);
- fc_fcp_unlock_pkt(fsp);
}
+ spin_unlock_bh(&fsp->scsi_pkt_lock);

fc_fcp_pkt_release(fsp);
spin_lock_irqsave(&si->scsi_queue_lock, flags);
--
1.9.1

2016-03-16 08:26:43

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 064/107] x86/ldt: Correct LDT access in single stepping logic

From: Juergen Gross <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 136d9d83c07c5e30ac49fc83b27e8c4842f108fc upstream.

Commit 37868fe113ff ("x86/ldt: Make modify_ldt synchronous")
introduced a new struct ldt_struct anchored at mm->context.ldt.

convert_ip_to_linear() was changed to reflect this, but indexing
into the ldt has to be changed as the pointer is no longer void *.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/kernel/step.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 5d7eccc..1565262 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -28,11 +28,11 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
struct desc_struct *desc;
unsigned long base;

- seg &= ~7UL;
+ seg >>= 3;

mutex_lock(&child->mm->context.lock);
if (unlikely(!child->mm->context.ldt ||
- (seg >> 3) >= child->mm->context.ldt->size))
+ seg >= child->mm->context.ldt->size))
addr = -1L; /* bogus selector, access would fault */
else {
desc = &child->mm->context.ldt->entries[seg];
--
1.9.1

2016-03-16 08:26:34

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 068/107] libiscsi: Fix host busy blocking during connection teardown

From: John Soni Jose <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 660d0831d1494a6837b2f810d08b5be092c1f31d upstream.

In case of hw iscsi offload, an host can have N-number of active
connections. There can be IO's running on some connections which
make host->host_busy always TRUE. Now if logout from a connection
is tried then the code gets into an infinite loop as host->host_busy
is always TRUE.

iscsi_conn_teardown(....)
{
.........
/*
* Block until all in-progress commands for this connection
* time out or fail.
*/
for (;;) {
spin_lock_irqsave(session->host->host_lock, flags);
if (!atomic_read(&session->host->host_busy)) { /* OK for ERL == 0 */
spin_unlock_irqrestore(session->host->host_lock, flags);
break;
}
spin_unlock_irqrestore(session->host->host_lock, flags);
msleep_interruptible(500);
iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
"host_busy %d host_failed %d\n",
atomic_read(&session->host->host_busy),
session->host->host_failed);

................
...............
}
}

This is not an issue with software-iscsi/iser as each cxn is a separate
host.

Fix:
Acquiring eh_mutex in iscsi_conn_teardown() before setting
session->state = ISCSI_STATE_TERMINATE.

Signed-off-by: John Soni Jose <[email protected]>
Reviewed-by: Mike Christie <[email protected]>
Reviewed-by: Chris Leech <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-of-by: Zefan Li <[email protected]>
---
drivers/scsi/libiscsi.c | 25 ++-----------------------
1 file changed, 2 insertions(+), 23 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 1243d2f..d9a898c 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2907,10 +2907,10 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
{
struct iscsi_conn *conn = cls_conn->dd_data;
struct iscsi_session *session = conn->session;
- unsigned long flags;

del_timer_sync(&conn->transport_timer);

+ mutex_lock(&session->eh_mutex);
spin_lock_bh(&session->lock);
conn->c_stage = ISCSI_CONN_CLEANUP_WAIT;
if (session->leadconn == conn) {
@@ -2922,28 +2922,6 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
}
spin_unlock_bh(&session->lock);

- /*
- * Block until all in-progress commands for this connection
- * time out or fail.
- */
- for (;;) {
- spin_lock_irqsave(session->host->host_lock, flags);
- if (!session->host->host_busy) { /* OK for ERL == 0 */
- spin_unlock_irqrestore(session->host->host_lock, flags);
- break;
- }
- spin_unlock_irqrestore(session->host->host_lock, flags);
- msleep_interruptible(500);
- iscsi_conn_printk(KERN_INFO, conn, "iscsi conn_destroy(): "
- "host_busy %d host_failed %d\n",
- session->host->host_busy,
- session->host->host_failed);
- /*
- * force eh_abort() to unblock
- */
- wake_up(&conn->ehwait);
- }
-
/* flush queued up work because we free the connection below */
iscsi_suspend_tx(conn);

@@ -2956,6 +2934,7 @@ void iscsi_conn_teardown(struct iscsi_cls_conn *cls_conn)
if (session->leadconn == conn)
session->leadconn = NULL;
spin_unlock_bh(&session->lock);
+ mutex_unlock(&session->eh_mutex);

iscsi_destroy_conn(cls_conn);
}
--
1.9.1

2016-03-16 08:28:12

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 060/107] rds: fix an integer overflow test in rds_info_getsockopt()

From: Dan Carpenter <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 468b732b6f76b138c0926eadf38ac88467dcd271 upstream.

"len" is a signed integer. We check that len is not negative, so it
goes from zero to INT_MAX. PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long. ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.

I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.

Fixes: a8c879a7ee98 ('RDS: Info and stats')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/rds/info.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rds/info.c b/net/rds/info.c
index 9a6b4f6..140a44a 100644
--- a/net/rds/info.c
+++ b/net/rds/info.c
@@ -176,7 +176,7 @@ int rds_info_getsockopt(struct socket *sock, int optname, char __user *optval,

/* check for all kinds of wrapping and the like */
start = (unsigned long)optval;
- if (len < 0 || len + PAGE_SIZE - 1 < len || start + len < start) {
+ if (len < 0 || len > INT_MAX - PAGE_SIZE + 1 || start + len < start) {
ret = -EINVAL;
goto out;
}
--
1.9.1

2016-03-16 08:28:17

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 058/107] MIPS: Fix sched_getaffinity with MT FPAFF enabled

From: Felix Fietkau <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 1d62d737555e1378eb62a8bba26644f7d97139d2 upstream.

p->thread.user_cpus_allowed is zero-initialized and is only filled on
the first sched_setaffinity call.

To avoid adding overhead in the task initialization codepath, simply OR
the returned mask in sched_getaffinity with p->cpus_allowed.

Signed-off-by: Felix Fietkau <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/10740/
Signed-off-by: Ralf Baechle <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/mips/kernel/mips-mt-fpaff.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/mips-mt-fpaff.c b/arch/mips/kernel/mips-mt-fpaff.c
index 33f63ba..c7e2684 100644
--- a/arch/mips/kernel/mips-mt-fpaff.c
+++ b/arch/mips/kernel/mips-mt-fpaff.c
@@ -154,7 +154,7 @@ asmlinkage long mipsmt_sys_sched_getaffinity(pid_t pid, unsigned int len,
unsigned long __user *user_mask_ptr)
{
unsigned int real_len;
- cpumask_t mask;
+ cpumask_t allowed, mask;
int retval;
struct task_struct *p;

@@ -173,7 +173,8 @@ asmlinkage long mipsmt_sys_sched_getaffinity(pid_t pid, unsigned int len,
if (retval)
goto out_unlock;

- cpumask_and(&mask, &p->thread.user_cpus_allowed, cpu_possible_mask);
+ cpumask_or(&allowed, &p->thread.user_cpus_allowed, &p->cpus_allowed);
+ cpumask_and(&mask, &allowed, cpu_active_mask);

out_unlock:
read_unlock(&tasklist_lock);
--
1.9.1

2016-03-16 08:28:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 056/107] x86/ldt: Make modify_ldt synchronous

From: Andy Lutomirski <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 37868fe113ff2ba814b3b4eb12df214df555f8dc upstream.

modify_ldt() has questionable locking and does not synchronize
threads. Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

This fixes some fallout from the CVE-2015-5157 fixes.

Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jan Beulich <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected] <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
[bwh: Backported to 3.2:
- Adjust context
- Drop comment changes in switch_mm()
- Drop changes to get_segment_base() in arch/x86/kernel/cpu/perf_event.c
- Open-code lockless_dereference(), smp_store_release(), on_each_cpu_mask()]
Signed-off-by: Ben Hutchings <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
arch/x86/include/asm/desc.h | 15 ---
arch/x86/include/asm/mmu.h | 3 +-
arch/x86/include/asm/mmu_context.h | 49 ++++++-
arch/x86/kernel/cpu/common.c | 4 +-
arch/x86/kernel/ldt.c | 267 ++++++++++++++++++++-----------------
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/step.c | 6 +-
arch/x86/power/cpu.c | 3 +-
8 files changed, 205 insertions(+), 146 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index fa9c8c7..d34c94f 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -279,21 +279,6 @@ static inline void clear_LDT(void)
set_ldt(NULL, 0);
}

-/*
- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc)
-{
- set_ldt(pc->ldt, pc->size);
-}
-
-static inline void load_LDT(mm_context_t *pc)
-{
- preempt_disable();
- load_LDT_nolock(pc);
- preempt_enable();
-}
-
static inline unsigned long get_desc_base(const struct desc_struct *desc)
{
return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24));
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5f55e69..926f672 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
* we put the segment information here.
*/
typedef struct {
- void *ldt;
- int size;
+ struct ldt_struct *ldt;

#ifdef CONFIG_X86_64
/* True if mm supports a task running in 32 bit compatibility mode. */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 6902152..ce4ea94 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -16,6 +16,51 @@ static inline void paravirt_activate_mm(struct mm_struct *prev,
#endif /* !CONFIG_PARAVIRT */

/*
+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+ /*
+ * Xen requires page-aligned LDTs with special permissions. This is
+ * needed to prevent us from installing evil descriptors such as
+ * call gates. On native, we could merge the ldt_struct and LDT
+ * allocations, but it's not worth trying to optimize.
+ */
+ struct desc_struct *entries;
+ int size;
+};
+
+static inline void load_mm_ldt(struct mm_struct *mm)
+{
+ struct ldt_struct *ldt;
+
+ /* smp_read_barrier_depends synchronizes with barrier in install_ldt */
+ ldt = ACCESS_ONCE(mm->context.ldt);
+ smp_read_barrier_depends();
+
+ /*
+ * Any change to mm->context.ldt is followed by an IPI to all
+ * CPUs with the mm active. The LDT will not be freed until
+ * after the IPI is handled by all such CPUs. This means that,
+ * if the ldt_struct changes before we return, the values we see
+ * will be safe, and the new values will be loaded before we run
+ * any user code.
+ *
+ * NB: don't try to convert this to use RCU without extreme care.
+ * We would still need IRQs off, because we don't want to change
+ * the local LDT after an IPI loaded a newer value than the one
+ * that we can see.
+ */
+
+ if (unlikely(ldt))
+ set_ldt(ldt->entries, ldt->size);
+ else
+ clear_LDT();
+
+ DEBUG_LOCKS_WARN_ON(preemptible());
+}
+
+/*
* Used for LDT copy/destruction.
*/
int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
@@ -52,7 +97,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
* load the LDT, if the LDT is different:
*/
if (unlikely(prev->context.ldt != next->context.ldt))
- load_LDT_nolock(&next->context);
+ load_mm_ldt(next);
}
#ifdef CONFIG_SMP
else {
@@ -65,7 +110,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
* to make sure to use no freed page tables.
*/
load_cr3(next->pgd);
- load_LDT_nolock(&next->context);
+ load_mm_ldt(next);
}
}
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 114db0f..b190a62 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1254,7 +1254,7 @@ void __cpuinit cpu_init(void)
load_sp0(t, &current->thread);
set_tss_desc(cpu, t);
load_TR_desc();
- load_LDT(&init_mm.context);
+ load_mm_ldt(&init_mm);

clear_all_debug_regs();
dbg_restore_debug_regs();
@@ -1302,7 +1302,7 @@ void __cpuinit cpu_init(void)
load_sp0(t, thread);
set_tss_desc(cpu, t);
load_TR_desc();
- load_LDT(&init_mm.context);
+ load_mm_ldt(&init_mm);

t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);

diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index c37886d..fba5131 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -12,6 +12,7 @@
#include <linux/string.h>
#include <linux/mm.h>
#include <linux/smp.h>
+#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/uaccess.h>

@@ -20,82 +21,87 @@
#include <asm/mmu_context.h>
#include <asm/syscalls.h>

-#ifdef CONFIG_SMP
+/* context.lock is held for us, so we don't need any locking. */
static void flush_ldt(void *current_mm)
{
- if (current->active_mm == current_mm)
- load_LDT(&current->active_mm->context);
+ mm_context_t *pc;
+
+ if (current->active_mm != current_mm)
+ return;
+
+ pc = &current->active_mm->context;
+ set_ldt(pc->ldt->entries, pc->ldt->size);
}
-#endif

-static int alloc_ldt(mm_context_t *pc, int mincount, int reload)
+/* The caller must call finalize_ldt_struct on the result. LDT starts zeroed. */
+static struct ldt_struct *alloc_ldt_struct(int size)
{
- void *oldldt, *newldt;
- int oldsize;
-
- if (mincount <= pc->size)
- return 0;
- oldsize = pc->size;
- mincount = (mincount + (PAGE_SIZE / LDT_ENTRY_SIZE - 1)) &
- (~(PAGE_SIZE / LDT_ENTRY_SIZE - 1));
- if (mincount * LDT_ENTRY_SIZE > PAGE_SIZE)
- newldt = vmalloc(mincount * LDT_ENTRY_SIZE);
+ struct ldt_struct *new_ldt;
+ int alloc_size;
+
+ if (size > LDT_ENTRIES)
+ return NULL;
+
+ new_ldt = kmalloc(sizeof(struct ldt_struct), GFP_KERNEL);
+ if (!new_ldt)
+ return NULL;
+
+ BUILD_BUG_ON(LDT_ENTRY_SIZE != sizeof(struct desc_struct));
+ alloc_size = size * LDT_ENTRY_SIZE;
+
+ /*
+ * Xen is very picky: it requires a page-aligned LDT that has no
+ * trailing nonzero bytes in any page that contains LDT descriptors.
+ * Keep it simple: zero the whole allocation and never allocate less
+ * than PAGE_SIZE.
+ */
+ if (alloc_size > PAGE_SIZE)
+ new_ldt->entries = vzalloc(alloc_size);
else
- newldt = (void *)__get_free_page(GFP_KERNEL);
-
- if (!newldt)
- return -ENOMEM;
+ new_ldt->entries = kzalloc(PAGE_SIZE, GFP_KERNEL);

- if (oldsize)
- memcpy(newldt, pc->ldt, oldsize * LDT_ENTRY_SIZE);
- oldldt = pc->ldt;
- memset(newldt + oldsize * LDT_ENTRY_SIZE, 0,
- (mincount - oldsize) * LDT_ENTRY_SIZE);
+ if (!new_ldt->entries) {
+ kfree(new_ldt);
+ return NULL;
+ }

- paravirt_alloc_ldt(newldt, mincount);
+ new_ldt->size = size;
+ return new_ldt;
+}

-#ifdef CONFIG_X86_64
- /* CHECKME: Do we really need this ? */
- wmb();
-#endif
- pc->ldt = newldt;
- wmb();
- pc->size = mincount;
- wmb();
-
- if (reload) {
-#ifdef CONFIG_SMP
- preempt_disable();
- load_LDT(pc);
- if (!cpumask_equal(mm_cpumask(current->mm),
- cpumask_of(smp_processor_id())))
- smp_call_function(flush_ldt, current->mm, 1);
- preempt_enable();
-#else
- load_LDT(pc);
-#endif
- }
- if (oldsize) {
- paravirt_free_ldt(oldldt, oldsize);
- if (oldsize * LDT_ENTRY_SIZE > PAGE_SIZE)
- vfree(oldldt);
- else
- put_page(virt_to_page(oldldt));
- }
- return 0;
+/* After calling this, the LDT is immutable. */
+static void finalize_ldt_struct(struct ldt_struct *ldt)
+{
+ paravirt_alloc_ldt(ldt->entries, ldt->size);
}

-static inline int copy_ldt(mm_context_t *new, mm_context_t *old)
+/* context.lock is held */
+static void install_ldt(struct mm_struct *current_mm,
+ struct ldt_struct *ldt)
{
- int err = alloc_ldt(new, old->size, 0);
- int i;
+ /* Synchronizes with smp_read_barrier_depends in load_mm_ldt. */
+ barrier();
+ ACCESS_ONCE(current_mm->context.ldt) = ldt;
+
+ /* Activate the LDT for all CPUs using current_mm. */
+ smp_call_function_many(mm_cpumask(current_mm), flush_ldt, current_mm,
+ true);
+ local_irq_disable();
+ flush_ldt(current_mm);
+ local_irq_enable();
+}

- if (err < 0)
- return err;
+static void free_ldt_struct(struct ldt_struct *ldt)
+{
+ if (likely(!ldt))
+ return;

- for (i = 0; i < old->size; i++)
- write_ldt_entry(new->ldt, i, old->ldt + i * LDT_ENTRY_SIZE);
- return 0;
+ paravirt_free_ldt(ldt->entries, ldt->size);
+ if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
+ vfree(ldt->entries);
+ else
+ kfree(ldt->entries);
+ kfree(ldt);
}

/*
@@ -104,17 +110,37 @@ static inline int copy_ldt(mm_context_t *new, mm_context_t *old)
*/
int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
{
+ struct ldt_struct *new_ldt;
struct mm_struct *old_mm;
int retval = 0;

mutex_init(&mm->context.lock);
- mm->context.size = 0;
old_mm = current->mm;
- if (old_mm && old_mm->context.size > 0) {
- mutex_lock(&old_mm->context.lock);
- retval = copy_ldt(&mm->context, &old_mm->context);
- mutex_unlock(&old_mm->context.lock);
+ if (!old_mm) {
+ mm->context.ldt = NULL;
+ return 0;
+ }
+
+ mutex_lock(&old_mm->context.lock);
+ if (!old_mm->context.ldt) {
+ mm->context.ldt = NULL;
+ goto out_unlock;
}
+
+ new_ldt = alloc_ldt_struct(old_mm->context.ldt->size);
+ if (!new_ldt) {
+ retval = -ENOMEM;
+ goto out_unlock;
+ }
+
+ memcpy(new_ldt->entries, old_mm->context.ldt->entries,
+ new_ldt->size * LDT_ENTRY_SIZE);
+ finalize_ldt_struct(new_ldt);
+
+ mm->context.ldt = new_ldt;
+
+out_unlock:
+ mutex_unlock(&old_mm->context.lock);
return retval;
}

@@ -125,53 +151,47 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
*/
void destroy_context(struct mm_struct *mm)
{
- if (mm->context.size) {
-#ifdef CONFIG_X86_32
- /* CHECKME: Can this ever happen ? */
- if (mm == current->active_mm)
- clear_LDT();
-#endif
- paravirt_free_ldt(mm->context.ldt, mm->context.size);
- if (mm->context.size * LDT_ENTRY_SIZE > PAGE_SIZE)
- vfree(mm->context.ldt);
- else
- put_page(virt_to_page(mm->context.ldt));
- mm->context.size = 0;
- }
+ free_ldt_struct(mm->context.ldt);
+ mm->context.ldt = NULL;
}

static int read_ldt(void __user *ptr, unsigned long bytecount)
{
- int err;
+ int retval;
unsigned long size;
struct mm_struct *mm = current->mm;

- if (!mm->context.size)
- return 0;
+ mutex_lock(&mm->context.lock);
+
+ if (!mm->context.ldt) {
+ retval = 0;
+ goto out_unlock;
+ }
+
if (bytecount > LDT_ENTRY_SIZE * LDT_ENTRIES)
bytecount = LDT_ENTRY_SIZE * LDT_ENTRIES;

- mutex_lock(&mm->context.lock);
- size = mm->context.size * LDT_ENTRY_SIZE;
+ size = mm->context.ldt->size * LDT_ENTRY_SIZE;
if (size > bytecount)
size = bytecount;

- err = 0;
- if (copy_to_user(ptr, mm->context.ldt, size))
- err = -EFAULT;
- mutex_unlock(&mm->context.lock);
- if (err < 0)
- goto error_return;
+ if (copy_to_user(ptr, mm->context.ldt->entries, size)) {
+ retval = -EFAULT;
+ goto out_unlock;
+ }
+
if (size != bytecount) {
- /* zero-fill the rest */
- if (clear_user(ptr + size, bytecount - size) != 0) {
- err = -EFAULT;
- goto error_return;
+ /* Zero-fill the rest and pretend we read bytecount bytes. */
+ if (clear_user(ptr + size, bytecount - size)) {
+ retval = -EFAULT;
+ goto out_unlock;
}
}
- return bytecount;
-error_return:
- return err;
+ retval = bytecount;
+
+out_unlock:
+ mutex_unlock(&mm->context.lock);
+ return retval;
}

static int read_default_ldt(void __user *ptr, unsigned long bytecount)
@@ -195,6 +215,8 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode)
struct desc_struct ldt;
int error;
struct user_desc ldt_info;
+ int oldsize, newsize;
+ struct ldt_struct *new_ldt, *old_ldt;

error = -EINVAL;
if (bytecount != sizeof(ldt_info))
@@ -213,34 +235,39 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode)
goto out;
}

- mutex_lock(&mm->context.lock);
- if (ldt_info.entry_number >= mm->context.size) {
- error = alloc_ldt(&current->mm->context,
- ldt_info.entry_number + 1, 1);
- if (error < 0)
- goto out_unlock;
- }
-
- /* Allow LDTs to be cleared by the user. */
- if (ldt_info.base_addr == 0 && ldt_info.limit == 0) {
- if (oldmode || LDT_empty(&ldt_info)) {
- memset(&ldt, 0, sizeof(ldt));
- goto install;
+ if ((oldmode && !ldt_info.base_addr && !ldt_info.limit) ||
+ LDT_empty(&ldt_info)) {
+ /* The user wants to clear the entry. */
+ memset(&ldt, 0, sizeof(ldt));
+ } else {
+ if (!IS_ENABLED(CONFIG_X86_16BIT) && !ldt_info.seg_32bit) {
+ error = -EINVAL;
+ goto out;
}
+
+ fill_ldt(&ldt, &ldt_info);
+ if (oldmode)
+ ldt.avl = 0;
}

- if (!IS_ENABLED(CONFIG_X86_16BIT) && !ldt_info.seg_32bit) {
- error = -EINVAL;
+ mutex_lock(&mm->context.lock);
+
+ old_ldt = mm->context.ldt;
+ oldsize = old_ldt ? old_ldt->size : 0;
+ newsize = max((int)(ldt_info.entry_number + 1), oldsize);
+
+ error = -ENOMEM;
+ new_ldt = alloc_ldt_struct(newsize);
+ if (!new_ldt)
goto out_unlock;
- }

- fill_ldt(&ldt, &ldt_info);
- if (oldmode)
- ldt.avl = 0;
+ if (old_ldt)
+ memcpy(new_ldt->entries, old_ldt->entries, oldsize * LDT_ENTRY_SIZE);
+ new_ldt->entries[ldt_info.entry_number] = ldt;
+ finalize_ldt_struct(new_ldt);

- /* Install the new entry ... */
-install:
- write_ldt_entry(mm->context.ldt, ldt_info.entry_number, &ldt);
+ install_ldt(mm, new_ldt);
+ free_ldt_struct(old_ldt);
error = 0;

out_unlock:
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index bb390e1..3ebca08 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -116,11 +116,11 @@ void __show_regs(struct pt_regs *regs, int all)
void release_thread(struct task_struct *dead_task)
{
if (dead_task->mm) {
- if (dead_task->mm->context.size) {
+ if (dead_task->mm->context.ldt) {
printk("WARNING: dead process %8s still has LDT? <%p/%d>\n",
dead_task->comm,
dead_task->mm->context.ldt,
- dead_task->mm->context.size);
+ dead_task->mm->context.ldt->size);
BUG();
}
}
diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index f89cdc6..5d7eccc 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -5,6 +5,7 @@
#include <linux/mm.h>
#include <linux/ptrace.h>
#include <asm/desc.h>
+#include <asm/mmu_context.h>

unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs)
{
@@ -30,10 +31,11 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
seg &= ~7UL;

mutex_lock(&child->mm->context.lock);
- if (unlikely((seg >> 3) >= child->mm->context.size))
+ if (unlikely(!child->mm->context.ldt ||
+ (seg >> 3) >= child->mm->context.ldt->size))
addr = -1L; /* bogus selector, access would fault */
else {
- desc = child->mm->context.ldt + seg;
+ desc = &child->mm->context.ldt->entries[seg];
base = get_desc_base(desc);

/* 16-bit code segment? */
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index fcbaac60..dd298e7 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -22,6 +22,7 @@
#include <asm/suspend.h>
#include <asm/debugreg.h>
#include <asm/fpu-internal.h> /* pcntxt_mask */
+#include <asm/mmu_context.h>

#ifdef CONFIG_X86_32
static struct saved_context saved_context;
@@ -148,7 +149,7 @@ static void fix_processor_context(void)
syscall_init(); /* This sets MSR_*STAR and related */
#endif
load_TR_desc(); /* This does ltr */
- load_LDT(&current->active_mm->context); /* This does lldt */
+ load_mm_ldt(current->active_mm); /* This does lldt */
}

/**
--
1.9.1

2016-03-16 08:28:09

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 061/107] perf: Fix fasync handling on inherited events

From: Peter Zijlstra <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit fed66e2cdd4f127a43fd11b8d92a99bdd429528c upstream.

Vince reported that the fasync signal stuff doesn't work proper for
inherited events. So fix that.

Installing fasync allocates memory and sets filp->f_flags |= FASYNC,
which upon the demise of the file descriptor ensures the allocation is
freed and state is updated.

Now for perf, we can have the events stick around for a while after the
original FD is dead because of references from child events. So we
cannot copy the fasync pointer around. We can however consistently use
the parent's fasync, as that will be updated.

Reported-and-Tested-by: Vince Weaver <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Arnaldo Carvalho deMelo <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/1434011521.1495.71.camel@twins
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
kernel/events/core.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 461b6e0..2e6c2484 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3863,12 +3863,20 @@ static const struct file_operations perf_fops = {
* to user-space before waking everybody up.
*/

+static inline struct fasync_struct **perf_event_fasync(struct perf_event *event)
+{
+ /* only the parent has fasync state */
+ if (event->parent)
+ event = event->parent;
+ return &event->fasync;
+}
+
void perf_event_wakeup(struct perf_event *event)
{
ring_buffer_wakeup(event);

if (event->pending_kill) {
- kill_fasync(&event->fasync, SIGIO, event->pending_kill);
+ kill_fasync(perf_event_fasync(event), SIGIO, event->pending_kill);
event->pending_kill = 0;
}
}
@@ -4879,7 +4887,7 @@ static int __perf_event_overflow(struct perf_event *event,
else
perf_event_output(event, data, regs);

- if (event->fasync && event->pending_kill) {
+ if (*perf_event_fasync(event) && event->pending_kill) {
event->pending_wakeup = 1;
irq_work_queue(&event->pending);
}
--
1.9.1

2016-03-16 08:28:04

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 062/107] MIPS: Make set_pte() SMP safe.

From: David Daney <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 46011e6ea39235e4aca656673c500eac81a07a17 upstream.

On MIPS the GLOBAL bit of the PTE must have the same value in any
aligned pair of PTEs. These pairs of PTEs are referred to as
"buddies". In a SMP system is is possible for two CPUs to be calling
set_pte() on adjacent PTEs at the same time. There is a race between
setting the PTE and a different CPU setting the GLOBAL bit in its
buddy PTE.

This race can be observed when multiple CPUs are executing
vmap()/vfree() at the same time.

Make setting the buddy PTE's GLOBAL bit an atomic operation to close
the race condition.

The case of CONFIG_64BIT_PHYS_ADDR && CONFIG_CPU_MIPS32 is *not*
handled.

Signed-off-by: David Daney <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/10835/
Signed-off-by: Ralf Baechle <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/mips/include/asm/pgtable.h | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index b2202a6..95bcedb 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -153,8 +153,39 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
* Make sure the buddy is global too (if it's !none,
* it better already be global)
*/
+#ifdef CONFIG_SMP
+ /*
+ * For SMP, multiple CPUs can race, so we need to do
+ * this atomically.
+ */
+#ifdef CONFIG_64BIT
+#define LL_INSN "lld"
+#define SC_INSN "scd"
+#else /* CONFIG_32BIT */
+#define LL_INSN "ll"
+#define SC_INSN "sc"
+#endif
+ unsigned long page_global = _PAGE_GLOBAL;
+ unsigned long tmp;
+
+ __asm__ __volatile__ (
+ " .set push\n"
+ " .set noreorder\n"
+ "1: " LL_INSN " %[tmp], %[buddy]\n"
+ " bnez %[tmp], 2f\n"
+ " or %[tmp], %[tmp], %[global]\n"
+ " " SC_INSN " %[tmp], %[buddy]\n"
+ " beqz %[tmp], 1b\n"
+ " nop\n"
+ "2:\n"
+ " .set pop"
+ : [buddy] "+m" (buddy->pte),
+ [tmp] "=&r" (tmp)
+ : [global] "r" (page_global));
+#else /* !CONFIG_SMP */
if (pte_none(*buddy))
pte_val(*buddy) = pte_val(*buddy) | _PAGE_GLOBAL;
+#endif /* CONFIG_SMP */
}
#endif
}
--
1.9.1

2016-03-16 08:28:01

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 063/107] ocfs2: fix BUG in ocfs2_downconvert_thread_do_work()

From: Joseph Qi <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 209f7512d007980fd111a74a064d70a3656079cf upstream.

The "BUG_ON(list_empty(&osb->blocked_lock_list))" in
ocfs2_downconvert_thread_do_work can be triggered in the following case:

ocfs2dc has firstly saved osb->blocked_lock_count to local varibale
processed, and then processes the dentry lockres. During the dentry
put, it calls iput and then deletes rw, inode and open lockres from
blocked list in ocfs2_mark_lockres_freeing. And this causes the
variable `processed' to not reflect the number of blocked lockres to be
processed, which triggers the BUG.

Signed-off-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
fs/ocfs2/dlmglue.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 231eab2..b5e457c 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -3968,9 +3968,13 @@ static void ocfs2_downconvert_thread_do_work(struct ocfs2_super *osb)
osb->dc_work_sequence = osb->dc_wake_sequence;

processed = osb->blocked_lock_count;
- while (processed) {
- BUG_ON(list_empty(&osb->blocked_lock_list));
-
+ /*
+ * blocked lock processing in this loop might call iput which can
+ * remove items off osb->blocked_lock_list. Downconvert up to
+ * 'processed' number of locks, but stop short if we had some
+ * removed in ocfs2_mark_lockres_freeing when downconverting.
+ */
+ while (processed && !list_empty(&osb->blocked_lock_list)) {
lockres = list_entry(osb->blocked_lock_list.next,
struct ocfs2_lock_res, l_blocked_list);
list_del_init(&lockres->l_blocked_list);
--
1.9.1

2016-03-16 08:30:16

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 057/107] md/raid1: extend spinlock to protect raid1_end_read_request against inconsistencies

From: NeilBrown <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 423f04d63cf421ea436bcc5be02543d549ce4b28 upstream.

raid1_end_read_request() assumes that the In_sync bits are consistent
with the ->degaded count.
raid1_spare_active updates the In_sync bit before the ->degraded count
and so exposes an inconsistency, as does error()
So extend the spinlock in raid1_spare_active() and error() to hide those
inconsistencies.

This should probably be part of
Commit: 34cab6f42003 ("md/raid1: fix test for 'was read error from
last working device'.")
as it addresses the same issue. It fixes the same bug and should go
to -stable for same reasons.

Fixes: 76073054c95b ("md/raid1: clean up read_balance.")
Signed-off-by: NeilBrown <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/raid1.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 27af2f3..189eedb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1250,6 +1250,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
{
char b[BDEVNAME_SIZE];
struct r1conf *conf = mddev->private;
+ unsigned long flags;

/*
* If it is not operational, then we have already marked it as dead
@@ -1269,6 +1270,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
return;
}
set_bit(Blocked, &rdev->flags);
+ spin_lock_irqsave(&conf->device_lock, flags);
if (test_and_clear_bit(In_sync, &rdev->flags)) {
unsigned long flags;
spin_lock_irqsave(&conf->device_lock, flags);
@@ -1281,6 +1283,7 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
} else
set_bit(Faulty, &rdev->flags);
+ spin_unlock_irqrestore(&conf->device_lock, flags);
set_bit(MD_CHANGE_DEVS, &mddev->flags);
printk(KERN_ALERT
"md/raid1:%s: Disk failure on %s, disabling device.\n"
@@ -1334,7 +1337,10 @@ static int raid1_spare_active(struct mddev *mddev)
* Find all failed disks within the RAID1 configuration
* and mark them readable.
* Called under mddev lock, so rcu protection not needed.
+ * device_lock used to avoid races with raid1_end_read_request
+ * which expects 'In_sync' flags and ->degraded to be consistent.
*/
+ spin_lock_irqsave(&conf->device_lock, flags);
for (i = 0; i < conf->raid_disks; i++) {
struct md_rdev *rdev = conf->mirrors[i].rdev;
struct md_rdev *repl = conf->mirrors[conf->raid_disks + i].rdev;
@@ -1364,7 +1370,6 @@ static int raid1_spare_active(struct mddev *mddev)
sysfs_notify_dirent_safe(rdev->sysfs_state);
}
}
- spin_lock_irqsave(&conf->device_lock, flags);
mddev->degraded -= count;
spin_unlock_irqrestore(&conf->device_lock, flags);

--
1.9.1

2016-03-16 08:31:32

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 052/107] ALSA: usb-audio: add dB range mapping for some devices

From: Yao-Wen Mao <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 2d1cb7f658fb9c3ba8f9dab8aca297d4dfdec835 upstream.

Add the correct dB ranges of Bose Companion 5 and Drangonfly DAC 1.2.

Signed-off-by: Yao-Wen Mao <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
sound/usb/mixer_maps.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/sound/usb/mixer_maps.c b/sound/usb/mixer_maps.c
index 851786f..893b750 100644
--- a/sound/usb/mixer_maps.c
+++ b/sound/usb/mixer_maps.c
@@ -312,6 +312,20 @@ static const struct usbmix_name_map scms_usb3318_map[] = {
{ 0 }
};

+/* Bose companion 5, the dB conversion factor is 16 instead of 256 */
+static struct usbmix_dB_map bose_companion5_dB = {-5006, -6};
+static struct usbmix_name_map bose_companion5_map[] = {
+ { 3, NULL, .dB = &bose_companion5_dB },
+ { 0 } /* terminator */
+};
+
+/* Dragonfly DAC 1.2, the dB conversion factor is 1 instead of 256 */
+static struct usbmix_dB_map dragonfly_1_2_dB = {0, 5000};
+static struct usbmix_name_map dragonfly_1_2_map[] = {
+ { 7, NULL, .dB = &dragonfly_1_2_dB },
+ { 0 } /* terminator */
+};
+
/*
* Control map entries
*/
@@ -394,6 +408,16 @@ static struct usbmix_ctl_map usbmix_ctl_maps[] = {
.id = USB_ID(0x25c4, 0x0003),
.map = scms_usb3318_map,
},
+ {
+ /* Bose Companion 5 */
+ .id = USB_ID(0x05a7, 0x1020),
+ .map = bose_companion5_map,
+ },
+ {
+ /* Dragonfly DAC 1.2 */
+ .id = USB_ID(0x21b4, 0x0081),
+ .map = dragonfly_1_2_map,
+ },
{ 0 } /* terminator */
};

--
1.9.1

2016-03-16 08:09:56

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 043/107] Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen

From: Bernhard Bender <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 968491709e5b1aaf429428814fff3d932fa90b60 upstream.

This patch fixes a problem in the usbtouchscreen driver for DMC TSC-30
touch screen. Due to a missing delay between the RESET and SET_RATE
commands, the touch screen may become unresponsive during system startup or
driver loading.

According to the DMC documentation, a delay is needed after the RESET
command to allow the chip to complete its internal initialization. As this
delay is not guaranteed, we had a system where the touch screen
occasionally did not send any touch data. There was no other indication of
the problem.

The patch fixes the problem by adding a 150ms delay between the RESET and
SET_RATE commands.

Suggested-by: Jakob Mustafa <[email protected]>
Signed-off-by: Bernhard Bender <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/input/touchscreen/usbtouchscreen.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/input/touchscreen/usbtouchscreen.c b/drivers/input/touchscreen/usbtouchscreen.c
index ce384a4..e57eaf8 100644
--- a/drivers/input/touchscreen/usbtouchscreen.c
+++ b/drivers/input/touchscreen/usbtouchscreen.c
@@ -586,6 +586,9 @@ static int dmc_tsc10_init(struct usbtouch_usb *usbtouch)
goto err_out;
}

+ /* TSC-25 data sheet specifies a delay after the RESET command */
+ msleep(150);
+
/* set coordinate output rate */
buf[0] = buf[1] = 0xFF;
ret = usb_control_msg(dev, usb_rcvctrlpipe (dev, 0),
--
1.9.1

2016-03-16 08:31:59

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 051/107] USB: sierra: add 1199:68AB device ID

From: Dirk Behme <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 74472233233f577eaa0ca6d6e17d9017b6e53150 upstream.

Add support for the Sierra Wireless AR8550 device with
USB descriptor 0x1199, 0x68AB.

It is common with MC879x modules 1199:683c/683d which
also are composite devices with 7 interfaces (0..6)
and also MDM62xx based as the AR8550.

The major difference are only the interface attributes
02/02/01 on interfaces 3 and 4 on the AR8550. They are
vendor specific ff/ff/ff on MC879x modules.

lsusb reports:

Bus 001 Device 004: ID 1199:68ab Sierra Wireless, Inc.
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0x1199 Sierra Wireless, Inc.
idProduct 0x68ab
bcdDevice 0.06
iManufacturer 3 Sierra Wireless, Incorporated
iProduct 2 AR8550
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 198
bNumInterfaces 7
bConfigurationValue 1
iConfiguration 1 Sierra Configuration
bmAttributes 0xe0
Self Powered
Remote Wakeup
MaxPower 0mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 255 Vendor Specific Subclass
bInterfaceProtocol 255 Vendor Specific Protocol
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 255 Vendor Specific Subclass
bInterfaceProtocol 255 Vendor Specific Protocol
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02 EP 2 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 2
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 255 Vendor Specific Subclass
bInterfaceProtocol 255 Vendor Specific Protocol
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x03 EP 3 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 3
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x84 EP 4 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x85 EP 5 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x04 EP 4 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 4
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x86 EP 6 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x87 EP 7 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x05 EP 5 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 5
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 255 Vendor Specific Subclass
bInterfaceProtocol 255 Vendor Specific Protocol
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x88 EP 8 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x89 EP 9 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x06 EP 6 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 6
bAlternateSetting 0
bNumEndpoints 3
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 255 Vendor Specific Subclass
bInterfaceProtocol 255 Vendor Specific Protocol
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x8a EP 10 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x8b EP 11 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x07 EP 7 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 32
Device Qualifier (for other device speed):
bLength 10
bDescriptorType 6
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
bNumConfigurations 1
Device Status: 0x0001
Self Powered

Signed-off-by: Dirk Behme <[email protected]>
Cc: Lars Melin <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/serial/sierra.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/usb/serial/sierra.c b/drivers/usb/serial/sierra.c
index e3ddec0..dd0ca84 100644
--- a/drivers/usb/serial/sierra.c
+++ b/drivers/usb/serial/sierra.c
@@ -303,6 +303,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x68AA, 0xFF, 0xFF, 0xFF),
.driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
},
+ { USB_DEVICE(0x1199, 0x68AB) }, /* Sierra Wireless AR8550 */
/* AT&T Direct IP LTE modems */
{ USB_DEVICE_AND_INTERFACE_INFO(0x0F3D, 0x68AA, 0xFF, 0xFF, 0xFF),
.driver_info = (kernel_ulong_t)&direct_ip_interface_blacklist
--
1.9.1

2016-03-16 08:32:23

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 049/107] niu: don't count tx error twice in case of headroom realloc fails

From: Jiri Pirko <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 42288830494cd51873ca745a7a229023df061226 upstream.

Fixes: a3138df9 ("[NIU]: Add Sun Neptune ethernet driver.")
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/net/ethernet/sun/niu.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index 8489d09..e475f85 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -6659,10 +6659,8 @@ static netdev_tx_t niu_start_xmit(struct sk_buff *skb,
struct sk_buff *skb_new;

skb_new = skb_realloc_headroom(skb, len);
- if (!skb_new) {
- rp->tx_errors++;
+ if (!skb_new)
goto out_drop;
- }
kfree_skb(skb);
skb = skb_new;
} else
--
1.9.1

2016-03-16 08:33:04

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 050/107] vhost: actually track log eventfd file

From: Marc-André Lureau <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 7932c0bd7740f4cd2aa168d3ce0199e7af7d72d5 upstream.

While reviewing vhost log code, I found out that log_file is never
set. Note: I haven't tested the change (QEMU doesn't use LOG_FD yet).

Signed-off-by: Marc-André Lureau <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/vhost/vhost.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index a50cb9c..8b2bac4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -884,6 +884,7 @@ long vhost_dev_ioctl(struct vhost_dev *d, unsigned int ioctl, unsigned long arg)
}
if (eventfp != d->log_file) {
filep = d->log_file;
+ d->log_file = eventfp;
ctx = d->log_ctx;
d->log_ctx = eventfp ?
eventfd_ctx_fileget(eventfp) : NULL;
--
1.9.1

2016-03-16 08:33:27

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 047/107] crypto: ixp4xx - Remove bogus BUG_ON on scattered dst buffer

From: Herbert Xu <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f898c522f0e9ac9f3177d0762b76e2ab2d2cf9c0 upstream.

This patch removes a bogus BUG_ON in the ablkcipher path that
triggers when the destination buffer is different from the source
buffer and is scattered.

Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/crypto/ixp4xx_crypto.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/crypto/ixp4xx_crypto.c b/drivers/crypto/ixp4xx_crypto.c
index 8f3f74c..f731abc 100644
--- a/drivers/crypto/ixp4xx_crypto.c
+++ b/drivers/crypto/ixp4xx_crypto.c
@@ -915,7 +915,6 @@ static int ablk_perform(struct ablkcipher_request *req, int encrypt)
crypt->mode |= NPE_OP_NOT_IN_PLACE;
/* This was never tested by Intel
* for more than one dst buffer, I think. */
- BUG_ON(req->dst->length < nbytes);
req_ctx->dst = NULL;
if (!chainup_buffers(dev, req->dst, nbytes, &dst_hook,
flags, DMA_FROM_DEVICE))
--
1.9.1

2016-03-16 08:33:51

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 046/107] netfilter: nf_conntrack: Support expectations in different zones

From: Joe Stringer <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 upstream.

When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.

Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/netfilter/nf_conntrack_expect.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index e41ec84..6fedfb3 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -203,7 +203,8 @@ static inline int expect_clash(const struct nf_conntrack_expect *a,
a->mask.src.u3.all[count] & b->mask.src.u3.all[count];
}

- return nf_ct_tuple_mask_cmp(&a->tuple, &b->tuple, &intersect_mask);
+ return nf_ct_tuple_mask_cmp(&a->tuple, &b->tuple, &intersect_mask) &&
+ nf_ct_zone(a->master) == nf_ct_zone(b->master);
}

static inline int expect_matches(const struct nf_conntrack_expect *a,
--
1.9.1

2016-03-16 08:33:55

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 045/107] mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()

From: Tomas Winkler <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 9098f84cced870f54d8c410dd2444cfa61467fa0 upstream.

Enclosing mmc_blk_put() is missing in power_ro_lock_show() sysfs handler,
let's add it.

Fixes: add710eaa886 ("mmc: boot partition ro lock support")
Signed-off-by: Tomas Winkler <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/mmc/card/block.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 47a789e..b4277ac 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -182,6 +182,8 @@ static ssize_t power_ro_lock_show(struct device *dev,

ret = snprintf(buf, PAGE_SIZE, "%d\n", locked);

+ mmc_blk_put(md);
+
return ret;
}

--
1.9.1

2016-03-16 08:34:55

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 042/107] tile: use free_bootmem_late() for initrd

From: Chris Metcalf <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 3f81d2447b37ac697b3c600039f2c6b628c06e21 upstream.

We were previously using free_bootmem() and just getting lucky
that nothing too bad happened.

Signed-off-by: Chris Metcalf <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/tile/kernel/setup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/tile/kernel/setup.c b/arch/tile/kernel/setup.c
index fd107ab..c40f806 100644
--- a/arch/tile/kernel/setup.c
+++ b/arch/tile/kernel/setup.c
@@ -972,7 +972,7 @@ static void __init load_hv_initrd(void)

void __init free_initrd_mem(unsigned long begin, unsigned long end)
{
- free_bootmem(__pa(begin), end - begin);
+ free_bootmem_late(__pa(begin), end - begin);
}

#else
--
1.9.1

2016-03-16 08:35:15

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 039/107] xhci: report U3 when link is in resume state

From: Zhuang Jin Can <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 243292a2ad3dc365849b820a64868927168894ac upstream.

xhci_hub_report_usb3_link_state() returns pls as U0 when the link
is in resume state, and this causes usb core to think the link is in
U0 while actually it's in resume state. When usb core transfers
control request on the link, it fails with TRB error as the link
is not ready for transfer.

To fix the issue, report U3 when the link is in resume state, thus
usb core knows the link it's not ready for transfer.

Signed-off-by: Zhuang Jin Can <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/host/xhci-hub.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index a6d4393..9be0c29 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -473,10 +473,13 @@ static void xhci_hub_report_link_state(struct xhci_hcd *xhci,
u32 pls = status_reg & PORT_PLS_MASK;

/* resume state is a xHCI internal state.
- * Do not report it to usb core.
+ * Do not report it to usb core, instead, pretend to be U3,
+ * thus usb core knows it's not ready for transfer
*/
- if (pls == XDEV_RESUME)
+ if (pls == XDEV_RESUME) {
+ *status |= USB_SS_PORT_LS_U3;
return;
+ }

/* When the CAS bit is set then warm reset
* should be performed on port
--
1.9.1

2016-03-16 08:35:20

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 041/107] usb-storage: ignore ZTE MF 823 card reader in mode 0x1225

From: Oliver Neukum <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 5fb2c782f451a4fb9c19c076e2c442839faf0f76 upstream.

This device automatically switches itself to another mode (0x1405)
unless the specific access pattern of Windows is followed in its
initial mode. That makes a dirty unmount of the internal storage
devices inevitable if they are mounted. So the card reader of
such a device should be ignored, lest an unclean removal become
inevitable.

This replaces an earlier patch that ignored all LUNs of this device.
That patch was overly broad.

Signed-off-by: Oliver Neukum <[email protected]>
Reviewed-by: Lars Melin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/storage/unusual_devs.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/drivers/usb/storage/unusual_devs.h b/drivers/usb/storage/unusual_devs.h
index d0ecaf9..61f93ea 100644
--- a/drivers/usb/storage/unusual_devs.h
+++ b/drivers/usb/storage/unusual_devs.h
@@ -2019,6 +2019,18 @@ UNUSUAL_DEV( 0x1908, 0x3335, 0x0200, 0x0200,
USB_SC_DEVICE, USB_PR_DEVICE, NULL,
US_FL_NO_READ_DISC_INFO ),

+/* Reported by Oliver Neukum <[email protected]>
+ * This device morphes spontaneously into another device if the access
+ * pattern of Windows isn't followed. Thus writable media would be dirty
+ * if the initial instance is used. So the device is limited to its
+ * virtual CD.
+ * And yes, the concept that BCD goes up to 9 is not heeded */
+UNUSUAL_DEV( 0x19d2, 0x1225, 0x0000, 0xffff,
+ "ZTE,Incorporated",
+ "ZTE WCDMA Technologies MSM",
+ USB_SC_DEVICE, USB_PR_DEVICE, NULL,
+ US_FL_SINGLE_LUN ),
+
/* Reported by Sven Geggus <[email protected]>
* This encrypted pen drive returns bogus data for the initial READ(10).
*/
--
1.9.1

2016-03-16 08:35:52

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 038/107] xhci: Calculate old endpoints correctly on device reset

From: Brian Campbell <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 326124a027abc9a7f43f72dc94f6f0f7a55b02b3 upstream.

When resetting a device the number of active TTs may need to be
corrected by xhci_update_tt_active_eps, but the number of old active
endpoints supplied to it was always zero, so the number of TTs and the
bandwidth reserved for them was not updated, and could rise
unnecessarily.

This affected systems using Intel's Patherpoint chipset, which rely on
software bandwidth checking. For example, a Lenovo X230 would lose the
ability to use ports on the docking station after enough suspend/resume
cycles because the bandwidth calculated would rise with every cycle when
a suitable device is attached.

The correct number of active endpoints is calculated in the same way as
in xhci_reserve_bandwidth.

Signed-off-by: Brian Campbell <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/host/xhci.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index d96652d..fd52e1e 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -3368,6 +3368,9 @@ int xhci_discover_or_reset_device(struct usb_hcd *hcd, struct usb_device *udev)
return -EINVAL;
}

+ if (virt_dev->tt_info)
+ old_active_eps = virt_dev->tt_info->active_eps;
+
if (virt_dev->udev != udev) {
/* If the virt_dev and the udev does not match, this virt_dev
* may belong to another udev.
--
1.9.1

2016-03-16 08:36:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 034/107] md: make sure everything is freed when dm-raid stops an array.

From: NeilBrown <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 5eff3c439d3478ba9e8ba5f8c0aaf6e6fadb6e58 upstream.

md_stop() would stop an array, but not free various attached
data structures.
For internal arrays, these are freed later in do_md_stop() or
mddev_put(), but they don't apply for dm-raid arrays.
So get md_stop() to free them, and only all it from dm-raid.
For internal arrays we now call __md_stop.

Reported-by: majianpeng <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/md.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 83dba06..25f0cb5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5120,7 +5120,7 @@ void md_stop_writes(struct mddev *mddev)
}
EXPORT_SYMBOL_GPL(md_stop_writes);

-void md_stop(struct mddev *mddev)
+static void __md_stop(struct mddev *mddev)
{
mddev->ready = 0;
mddev->pers->stop(mddev);
@@ -5130,6 +5130,18 @@ void md_stop(struct mddev *mddev)
mddev->pers = NULL;
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
}
+
+void md_stop(struct mddev *mddev)
+{
+ /* stop the array and free an attached data structures.
+ * This is called from dm-raid
+ */
+ __md_stop(mddev);
+ bitmap_destroy(mddev);
+ if (mddev->bio_set)
+ bioset_free(mddev->bio_set);
+}
+
EXPORT_SYMBOL_GPL(md_stop);

static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
@@ -5190,7 +5202,7 @@ static int do_md_stop(struct mddev * mddev, int mode,
set_disk_ro(disk, 0);

__md_stop_writes(mddev);
- md_stop(mddev);
+ __md_stop(mddev);
mddev->queue->merge_bvec_fn = NULL;
mddev->queue->backing_dev_info.congested_fn = NULL;

--
1.9.1

2016-03-16 08:36:19

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 036/107] usb: dwc3: Reset the transfer resource index on SET_INTERFACE

From: John Youn <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit aebda618718157a69c0dc0adb978d69bc2b8723c upstream.

This fixes an issue introduced in commit b23c843992b6 (usb: dwc3:
gadget: fix DEPSTARTCFG for non-EP0 EPs) that made sure we would
only use DEPSTARTCFG once per SetConfig.

The trick is that we should use one DEPSTARTCFG per SetConfig *OR*
SetInterface. SetInterface was completely missed from the original
patch.

This problem became aparent after commit 76e838c9f776 (usb: dwc3:
gadget: return error if command sent to DEPCMD register fails)
added checking of the return status of device endpoint commands.

'Set Endpoint Transfer Resource' command was caught failing
occasionally. This is because the Transfer Resource
Index was not getting reset during a SET_INTERFACE request.

Finally, to fix the issue, was we have to do is make sure that
our start_config_issued flag gets reset whenever we receive a
SetInterface request.

To verify the problem (and its fix), all we have to do is run
test 9 from testusb with 'testusb -t 9 -s 2048 -a -c 5000'.

Tested-by: Huang Rui <[email protected]>
Tested-by: Subbaraya Sundeep Bhatta <[email protected]>
Fixes: b23c843992b6 (usb: dwc3: gadget: fix DEPSTARTCFG for non-EP0 EPs)
Signed-off-by: John Youn <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
[lizf: Backported to 3.4: use dev_vdbg() instead of dwc3_trace()]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/dwc3/ep0.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index a8714fd..1d55451 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -496,6 +496,10 @@ static int dwc3_ep0_std_request(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
dev_vdbg(dwc->dev, "USB_REQ_SET_CONFIGURATION\n");
ret = dwc3_ep0_set_config(dwc, ctrl);
break;
+ case USB_REQ_SET_INTERFACE:
+ dev_vdbg(dwc->dev ,"USB_REQ_SET_INTERFACE");
+ dwc->start_config_issued = false;
+ /* Fall through */
default:
dev_vdbg(dwc->dev, "Forwarding to gadget driver\n");
ret = dwc3_ep0_delegate_req(dwc, ctrl);
--
1.9.1

2016-03-16 08:36:18

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 037/107] usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function

From: AMAN DEEP <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 3496810663922617d4b706ef2780c279252ddd6a upstream.

virt_dev->num_cached_rings counts on freed ring and is not updated
correctly. In xhci_free_or_cache_endpoint_ring() function, the free ring
is added into cache and then num_rings_cache is incremented as below:
virt_dev->ring_cache[rings_cached] =
virt_dev->eps[ep_index].ring;
virt_dev->num_rings_cached++;
here, free ring pointer is added to a current index and then
index is incremented.
So current index always points to empty location in the ring cache.
For getting available free ring, current index should be decremented
first and then corresponding ring buffer value should be taken from ring
cache.

But In function xhci_endpoint_init(), the num_rings_cached index is
accessed before decrement.
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;
virt_dev->num_rings_cached--;
This is bug in manipulating the index of ring cache.
And it should be as below:
virt_dev->num_rings_cached--;
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;

Signed-off-by: Aman Deep <[email protected]>
Signed-off-by: Mathias Nyman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/host/xhci-mem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index aa38b1f..048cc38 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1420,10 +1420,10 @@ int xhci_endpoint_init(struct xhci_hcd *xhci,
/* Attempt to use the ring cache */
if (virt_dev->num_rings_cached == 0)
return -ENOMEM;
+ virt_dev->num_rings_cached--;
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;
- virt_dev->num_rings_cached--;
xhci_reinit_cached_ring(xhci, virt_dev->eps[ep_index].new_ring,
1, type);
}
--
1.9.1

2016-03-16 08:37:10

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 033/107] inet: frags: fix defragmented packet's IP header for af_packet

From: Edward Hyunkoo Jee <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0848f6428ba3a2e42db124d41ac6f548655735bf upstream.

When ip_frag_queue() computes positions, it assumes that the passed
sk_buff does not contain L2 headers.

However, when PACKET_FANOUT_FLAG_DEFRAG is used, IP reassembly
functions can be called on outgoing packets that contain L2 headers.

Also, IPv4 checksum is not corrected after reassembly.

Fixes: 7736d33f4262 ("packet: Add pre-defragmentation support for ipv4 fanouts.")
Signed-off-by: Edward Hyunkoo Jee <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Jerry Chu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/ipv4/ip_fragment.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 4a40457..f459793 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -384,7 +384,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
ihl = ip_hdrlen(skb);

/* Determine the position of this fragment. */
- end = offset + skb->len - ihl;
+ end = offset + skb->len - skb_network_offset(skb) - ihl;
err = -EINVAL;

/* Is this the final fragment? */
@@ -414,7 +414,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
goto err;

err = -ENOMEM;
- if (pskb_pull(skb, ihl) == NULL)
+ if (!pskb_pull(skb, skb_network_offset(skb) + ihl))
goto err;

err = pskb_trim_rcsum(skb, end - offset);
@@ -637,6 +637,8 @@ static int ip_frag_reasm(struct ipq *qp, struct sk_buff *prev,
iph->frag_off = 0;
iph->tot_len = htons(len);
iph->tos |= ecn;
+ ip_send_check(iph);
+
IP_INC_STATS_BH(net, IPSTATS_MIB_REASMOKS);
qp->q.fragments = NULL;
qp->q.fragments_tail = NULL;
--
1.9.1

2016-03-16 08:37:43

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 032/107] mac80211: clear subdir_stations when removing debugfs

From: Tom Hughes <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4479004e6409087d1b4986881dc98c6c15dffb28 upstream.

If we don't do this, and we then fail to recreate the debugfs
directory during a mode change, then we will fail later trying
to add stations to this now bogus directory:

BUG: unable to handle kernel NULL pointer dereference at 0000006c
IP: [<c0a92202>] mutex_lock+0x12/0x30
Call Trace:
[<c0678ab4>] start_creating+0x44/0xc0
[<c0679203>] debugfs_create_dir+0x13/0xf0
[<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]

Signed-off-by: Tom Hughes <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
net/mac80211/debugfs_netdev.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/mac80211/debugfs_netdev.c b/net/mac80211/debugfs_netdev.c
index d5404cc..fa4a689 100644
--- a/net/mac80211/debugfs_netdev.c
+++ b/net/mac80211/debugfs_netdev.c
@@ -700,6 +700,7 @@ void ieee80211_debugfs_remove_netdev(struct ieee80211_sub_if_data *sdata)

debugfs_remove_recursive(sdata->debugfs.dir);
sdata->debugfs.dir = NULL;
+ sdata->debugfs.subdir_stations = NULL;
}

void ieee80211_debugfs_rename_netdev(struct ieee80211_sub_if_data *sdata)
--
1.9.1

2016-03-16 08:38:08

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 031/107] can: mcp251x: fix resume when device is down

From: Stefan Agner <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 25b401c1816ae64bcc5dcb1d39ab41812522a0ce upstream.

If a valid power regulator or a dummy regulator is used (which
happens to be the case when no regulator is specified), restart_work
is queued no matter whether the device was running or not at suspend
time. Since work queues get initialized in the ndo_open callback,
resuming leads to a NULL pointer exception.

Reverse exactly the steps executed at suspend time:
- Enable the power regulator in any case
- Enable the transceiver regulator if the device was running, even in
case we have a power regulator
- Queue restart_work only in case the device was running

Fixes: bf66f3736a94 ("can: mcp251x: Move to threaded interrupts instead of workqueues.")
Signed-off-by: Stefan Agner <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
[lizf: Backported to 3.4:
- adjust filename
- adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
drivers/net/can/mcp251x.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/can/mcp251x.c b/drivers/net/can/mcp251x.c
index 9d60742..d07426d 100644
--- a/drivers/net/can/mcp251x.c
+++ b/drivers/net/can/mcp251x.c
@@ -1161,18 +1161,17 @@ static int mcp251x_can_resume(struct spi_device *spi)
struct mcp251x_platform_data *pdata = spi->dev.platform_data;
struct mcp251x_priv *priv = dev_get_drvdata(&spi->dev);

- if (priv->after_suspend & AFTER_SUSPEND_POWER) {
+ if (priv->after_suspend & AFTER_SUSPEND_POWER)
pdata->power_enable(1);
+
+ if (priv->after_suspend & AFTER_SUSPEND_UP) {
+ if (pdata->transceiver_enable)
+ pdata->transceiver_enable(1);
queue_work(priv->wq, &priv->restart_work);
} else {
- if (priv->after_suspend & AFTER_SUSPEND_UP) {
- if (pdata->transceiver_enable)
- pdata->transceiver_enable(1);
- queue_work(priv->wq, &priv->restart_work);
- } else {
- priv->after_suspend = 0;
- }
+ priv->after_suspend = 0;
}
+
priv->force_quit = 0;
enable_irq(spi->irq);
return 0;
--
1.9.1

2016-03-16 08:38:34

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 028/107] libata: increase the timeout when setting transfer mode

From: Mikulas Patocka <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit d531be2ca2f27cca5f041b6a140504999144a617 upstream.

I have a ST4000DM000 disk. If Linux is booted while the disk is spun down,
the command that sets transfer mode causes the disk to spin up. The
spin-up takes longer than the default 5s timeout, so the command fails and
timeout is reported.

Fix this by increasing the timeout to 15s, which is enough for the disk to
spin up.

Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/ata/libata-core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 5a3b08b..68bdd59 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4444,7 +4444,8 @@ static unsigned int ata_dev_set_xfermode(struct ata_device *dev)
else /* In the ancient relic department - skip all of this */
return 0;

- err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 0);
+ /* On some disks, this command causes spin-up, so we need longer timeout */
+ err_mask = ata_exec_internal(dev, &tf, NULL, DMA_NONE, NULL, 0, 15000);

DPRINTK("EXIT, err_mask=%x\n", err_mask);
return err_mask;
--
1.9.1

2016-03-16 08:38:55

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 027/107] libata: force disable trim for SuperSSpeed S238

From: Arne Fitzenreiter <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit cda57b1b05cf7b8b99ab4b732bea0b05b6c015cc upstream.

This device loses blocks, often the partition table area, on trim.
Disable TRIM.
http://pcengines.ch/msata16a.htm

Signed-off-by: Arne Fitzenreiter <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/ata/libata-core.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index b5d532f..5a3b08b 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4144,6 +4144,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
{ "WD My Book", NULL, ATA_HORKAGE_1_5_GBPS, },
{ "Seagate FreeAgent GoFlex", NULL, ATA_HORKAGE_1_5_GBPS, },

+ /* devices that don't properly handle TRIM commands */
+ { "SuperSSpeed S238*", NULL, ATA_HORKAGE_NOTRIM, },
+
/*
* Devices which choke on SETXFER. Applies only if both the
* device and controller are SATA.
--
1.9.1

2016-03-16 08:39:16

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 025/107] libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER

From: Aleksei Mamlin <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 08c85d2a599d967ede38a847f5594447b6100642 upstream.

Enabling AA on HP 250GB SATA disk VB0250EAVER causes errors:

[ 3.788362] ata3.00: failed to enable AA (error_mask=0x1)
[ 3.789243] ata3.00: failed to enable AA (error_mask=0x1)

Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk.

tj: Collected FPDMA_AA entries and updated comment.

Signed-off-by: Aleksei Mamlin <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/ata/libata-core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0a6767b..b5d532f 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4104,9 +4104,10 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
{ "ST3320[68]13AS", "SD1[5-9]", ATA_HORKAGE_NONCQ |
ATA_HORKAGE_FIRMWARE_WARN },

- /* Seagate Momentus SpinPoint M8 seem to have FPMDA_AA issues */
+ /* drives which fail FPDMA_AA activation (some may freeze afterwards) */
{ "ST1000LM024 HN-M101MBB", "2AR10001", ATA_HORKAGE_BROKEN_FPDMA_AA },
{ "ST1000LM024 HN-M101MBB", "2BA30001", ATA_HORKAGE_BROKEN_FPDMA_AA },
+ { "VB0250EAVER", "HPG7", ATA_HORKAGE_BROKEN_FPDMA_AA },

/* Blacklist entries taken from Silicon Image 3124/3132
Windows driver .inf file - also several Linux problem reports */
--
1.9.1

2016-03-16 08:08:29

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 021/107] s390/process: fix sfpc inline assembly

From: Heiko Carstens <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit e47994dd44bcb4a77b4152bd0eada585934703c0 upstream.

The sfpc inline assembly within execve_tail() may incorrectly set bits
28-31 of the sfpc instruction to a value which is not zero.
These bits however are currently unused and therefore should be zero
so we won't get surprised if these bits will be used in the future.

Therefore remove the second operand from the inline assembly.

Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
arch/s390/kernel/process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 60055ce..5f947e5 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -246,7 +246,7 @@ asmlinkage void execve_tail(void)
{
current->thread.fp_regs.fpc = 0;
if (MACHINE_HAS_IEEE)
- asm volatile("sfpc %0,%0" : : "d" (0));
+ asm volatile("sfpc %0" : : "d" (0));
}

/*
--
1.9.1

2016-03-16 08:39:36

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 024/107] ata: pmp: add quirk for Marvell 4140 SATA PMP

From: Lior Amsalem <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 945b47441d83d2392ac9f984e0267ad521f24268 upstream.

This commit adds the necessary quirk to make the Marvell 4140 SATA PMP
work properly. This PMP doesn't like SRST on port number 4 (the host
port) so this commit marks this port as not supporting SRST.

Signed-off-by: Lior Amsalem <[email protected]>
Reviewed-by: Nadav Haklai <[email protected]>
Signed-off-by: Thomas Petazzoni <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/ata/libata-pmp.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c
index 0ba32fe..93ea335 100644
--- a/drivers/ata/libata-pmp.c
+++ b/drivers/ata/libata-pmp.c
@@ -460,6 +460,13 @@ static void sata_pmp_quirks(struct ata_port *ap)
ATA_LFLAG_NO_SRST |
ATA_LFLAG_ASSUME_ATA;
}
+ } else if (vendor == 0x11ab && devid == 0x4140) {
+ /* Marvell 4140 quirks */
+ ata_for_each_link(link, ap, EDGE) {
+ /* port 4 is for SEMB device and it doesn't like SRST */
+ if (link->pmp == 4)
+ link->flags |= ATA_LFLAG_DISABLED;
+ }
}
}

--
1.9.1

2016-03-16 08:40:25

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 023/107] st: null pointer dereference panic caused by use after kref_put by st_open

From: "Seymour, Shane M" <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit e7ac6c6666bec0a354758a1298d3231e4a635362 upstream.

Two SLES11 SP3 servers encountered similar crashes simultaneously
following some kind of SAN/tape target issue:

...
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002.
qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 -- 1 2002.
qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000.
rport-3:0-0: blocked FC remote port time out: removing target and saving binding
qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
rport-2:0-0: blocked FC remote port time out: removing target and saving binding
sg_rq_end_io: device detached
BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
CPU 0
...
Supported: No, Proprietary modules are loaded [1739975.390463]
Pid: 27965, comm: ABCD Tainted: PF X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8
RIP: 0010:[<ffffffff8133b268>] [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
RSP: 0018:ffff8839dc1e7c68 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138
RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0
R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001
R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80
FS: 00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640)
Stack:
ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80
ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000
ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e
Call Trace:
[<ffffffffa03fa309>] st_open+0x129/0x240 [st]
[<ffffffff8115ea1e>] chrdev_open+0x13e/0x200
[<ffffffff811588a8>] __dentry_open+0x198/0x310
[<ffffffff81167d74>] do_last+0x1f4/0x800
[<ffffffff81168fe9>] path_openat+0xd9/0x420
[<ffffffff8116946c>] do_filp_open+0x4c/0xc0
[<ffffffff8115a00f>] do_sys_open+0x17f/0x250
[<ffffffff81468d92>] system_call_fastpath+0x16/0x1b
[<00007f8e4f617fd0>] 0x7f8e4f617fcf
Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
RIP [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
RSP <ffff8839dc1e7c68>
CR2: 00000000000002a8

Analysis reveals the cause of the crash to be due to STp->device
being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
calls scsi_tape_release(). In st_open() we jump to err_out after
scsi_block_when_processing_errors() completes and returns the
device as offline (sdev_state was SDEV_DEL):

1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
1181 module count. */
1182 static int st_open(struct inode *inode, struct file *filp)
1183 {
1184 int i, retval = (-EIO);
1185 int resumed = 0;
1186 struct scsi_tape *STp;
1187 struct st_partstat *STps;
1188 int dev = TAPE_NR(inode);
1189 char *name;
...
1217 if (scsi_autopm_get_device(STp->device) < 0) {
1218 retval = -EIO;
1219 goto err_out;
1220 }
1221 resumed = 1;
1222 if (!scsi_block_when_processing_errors(STp->device)) {
1223 retval = (-ENXIO);
1224 goto err_out;
1225 }
...
1264 err_out:
1265 normalize_buffer(STp->buffer);
1266 spin_lock(&st_use_lock);
1267 STp->in_use = 0;
1268 spin_unlock(&st_use_lock);
1269 scsi_tape_put(STp); <-- STp->device = 0 after this
1270 if (resumed)
1271 scsi_autopm_put_device(STp->device);
1272 return retval;

The ref count for the struct scsi_tape had already been reduced
to 1 when the .remove method of the st module had been called.
The kref_put() in scsi_tape_put() caused scsi_tape_release()
to be called:

0266 static void scsi_tape_put(struct scsi_tape *STp)
0267 {
0268 struct scsi_device *sdev = STp->device;
0269
0270 mutex_lock(&st_ref_mutex);
0271 kref_put(&STp->kref, scsi_tape_release); <-- calls this
0272 scsi_device_put(sdev);
0273 mutex_unlock(&st_ref_mutex);
0274 }

In scsi_tape_release() the struct scsi_device in the struct
scsi_tape gets set to NULL:

4273 static void scsi_tape_release(struct kref *kref)
4274 {
4275 struct scsi_tape *tpnt = to_scsi_tape(kref);
4276 struct gendisk *disk = tpnt->disk;
4277
4278 tpnt->device = NULL; <<<---- where the dev is nulled
4279
4280 if (tpnt->buffer) {
4281 normalize_buffer(tpnt->buffer);
4282 kfree(tpnt->buffer->reserved_pages);
4283 kfree(tpnt->buffer);
4284 }
4285
4286 disk->private_data = NULL;
4287 put_disk(disk);
4288 kfree(tpnt);
4289 return;
4290 }

Although the problem was reported on SLES11.3 the problem appears
in linux-next as well.

The crash is fixed by reordering the code so we no longer access
the struct scsi_tape after the kref_put() is done on it in st_open().

Signed-off-by: Shane Seymour <[email protected]>
Signed-off-by: Darren Lavender <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Acked-by: Kai Mäkisara <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/scsi/st.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index e41998c..a3eb263 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1268,9 +1268,9 @@ static int st_open(struct inode *inode, struct file *filp)
err_out:
normalize_buffer(STp->buffer);
STp->in_use = 0;
- scsi_tape_put(STp);
if (resumed)
scsi_autopm_put_device(STp->device);
+ scsi_tape_put(STp);
mutex_unlock(&st_mutex);
return retval;

--
1.9.1

2016-03-16 08:40:50

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 022/107] rds: rds_ib_device.refcount overflow

From: Wengang Wang <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4fabb59449aa44a585b3603ffdadd4c5f4d0c033 upstream.

Fixes: 3e0249f9c05c ("RDS/IB: add refcount tracking to struct rds_ib_device")

There lacks a dropping on rds_ib_device.refcount in case rds_ib_alloc_fmr
failed(mr pool running out). this lead to the refcount overflow.

A complain in line 117(see following) is seen. From vmcore:
s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is -2147475448.
That is the evidence the mr pool is used up. so rds_ib_alloc_fmr is very likely
to return ERR_PTR(-EAGAIN).

115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
116 {
117 BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
118 if (atomic_dec_and_test(&rds_ibdev->refcount))
119 queue_work(rds_wq, &rds_ibdev->free_work);
120 }

fix is to drop refcount when rds_ib_alloc_fmr failed.

Signed-off-by: Wengang Wang <[email protected]>
Reviewed-by: Haggai Eran <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/rds/ib_rdma.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index e8fdb17..a985158 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -759,8 +759,10 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
}

ibmr = rds_ib_alloc_fmr(rds_ibdev);
- if (IS_ERR(ibmr))
+ if (IS_ERR(ibmr)) {
+ rds_ib_dev_put(rds_ibdev);
return ibmr;
+ }

ret = rds_ib_map_fmr(rds_ibdev, ibmr, sg, nents);
if (ret == 0)
--
1.9.1

2016-03-16 08:41:25

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 018/107] rtnetlink: verify IFLA_VF_INFO attributes before passing them to driver

From: Daniel Borkmann <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4f7d2cdfdde71ffe962399b7020c674050329423 upstream.

Jason Gunthorpe reported that since commit c02db8c6290b ("rtnetlink: make
SR-IOV VF interface symmetric"), we don't verify IFLA_VF_INFO attributes
anymore with respect to their policy, that is, ifla_vfinfo_policy[].

Before, they were part of ifla_policy[], but they have been nested since
placed under IFLA_VFINFO_LIST, that contains the attribute IFLA_VF_INFO,
which is another nested attribute for the actual VF attributes such as
IFLA_VF_MAC, IFLA_VF_VLAN, etc.

Despite the policy being split out from ifla_policy[] in this commit,
it's never applied anywhere. nla_for_each_nested() only does basic nla_ok()
testing for struct nlattr, but it doesn't know about the data context and
their requirements.

Fix, on top of Jason's initial work, does 1) parsing of the attributes
with the right policy, and 2) using the resulting parsed attribute table
from 1) instead of the nla_for_each_nested() loop (just like we used to
do when still part of ifla_policy[]).

Reference: http://thread.gmane.org/gmane.linux.network/368913
Fixes: c02db8c6290b ("rtnetlink: make SR-IOV VF interface symmetric")
Reported-by: Jason Gunthorpe <[email protected]>
Cc: Chris Wright <[email protected]>
Cc: Sucheta Chakraborty <[email protected]>
Cc: Greg Rose <[email protected]>
Cc: Jeff Kirsher <[email protected]>
Cc: Rony Efraim <[email protected]>
Cc: Vlad Zolotarov <[email protected]>
Cc: Nicolas Dichtel <[email protected]>
Cc: Thomas Graf <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Vlad Zolotarov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
[bwh: Backported to 3.2:
- Drop unsupported attributes
- Use ndo_set_vf_tx_rate operation, not ndo_set_vf_rate]
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
net/core/rtnetlink.c | 106 +++++++++++++++++++++++++--------------------------
1 file changed, 52 insertions(+), 54 deletions(-)

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 43c6dd8..8941e962 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1139,10 +1139,6 @@ static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
[IFLA_INFO_DATA] = { .type = NLA_NESTED },
};

-static const struct nla_policy ifla_vfinfo_policy[IFLA_VF_INFO_MAX+1] = {
- [IFLA_VF_INFO] = { .type = NLA_NESTED },
-};
-
static const struct nla_policy ifla_vf_policy[IFLA_VF_MAX+1] = {
[IFLA_VF_MAC] = { .len = sizeof(struct ifla_vf_mac) },
[IFLA_VF_VLAN] = { .len = sizeof(struct ifla_vf_vlan) },
@@ -1216,58 +1212,53 @@ static int validate_linkmsg(struct net_device *dev, struct nlattr *tb[])
return 0;
}

-static int do_setvfinfo(struct net_device *dev, struct nlattr *attr)
+static int do_setvfinfo(struct net_device *dev, struct nlattr **tb)
{
- int rem, err = -EINVAL;
- struct nlattr *vf;
const struct net_device_ops *ops = dev->netdev_ops;
+ int err = -EINVAL;

- nla_for_each_nested(vf, attr, rem) {
- switch (nla_type(vf)) {
- case IFLA_VF_MAC: {
- struct ifla_vf_mac *ivm;
- ivm = nla_data(vf);
- err = -EOPNOTSUPP;
- if (ops->ndo_set_vf_mac)
- err = ops->ndo_set_vf_mac(dev, ivm->vf,
- ivm->mac);
- break;
- }
- case IFLA_VF_VLAN: {
- struct ifla_vf_vlan *ivv;
- ivv = nla_data(vf);
- err = -EOPNOTSUPP;
- if (ops->ndo_set_vf_vlan)
- err = ops->ndo_set_vf_vlan(dev, ivv->vf,
- ivv->vlan,
- ivv->qos);
- break;
- }
- case IFLA_VF_TX_RATE: {
- struct ifla_vf_tx_rate *ivt;
- ivt = nla_data(vf);
- err = -EOPNOTSUPP;
- if (ops->ndo_set_vf_tx_rate)
- err = ops->ndo_set_vf_tx_rate(dev, ivt->vf,
- ivt->rate);
- break;
- }
- case IFLA_VF_SPOOFCHK: {
- struct ifla_vf_spoofchk *ivs;
- ivs = nla_data(vf);
- err = -EOPNOTSUPP;
- if (ops->ndo_set_vf_spoofchk)
- err = ops->ndo_set_vf_spoofchk(dev, ivs->vf,
- ivs->setting);
- break;
- }
- default:
- err = -EINVAL;
- break;
- }
- if (err)
- break;
+ if (tb[IFLA_VF_MAC]) {
+ struct ifla_vf_mac *ivm = nla_data(tb[IFLA_VF_MAC]);
+ err = -EOPNOTSUPP;
+ if (ops->ndo_set_vf_mac)
+ err = ops->ndo_set_vf_mac(dev, ivm->vf,
+ ivm->mac);
+ if (err < 0)
+ return err;
}
+
+ if (tb[IFLA_VF_VLAN]) {
+ struct ifla_vf_vlan *ivv = nla_data(tb[IFLA_VF_VLAN]);
+
+ err = -EOPNOTSUPP;
+ if (ops->ndo_set_vf_vlan)
+ err = ops->ndo_set_vf_vlan(dev, ivv->vf, ivv->vlan,
+ ivv->qos);
+ if (err < 0)
+ return err;
+ }
+
+ if (tb[IFLA_VF_TX_RATE]) {
+ struct ifla_vf_tx_rate *ivt = nla_data(tb[IFLA_VF_TX_RATE]);
+
+ if (ops->ndo_set_vf_tx_rate)
+ err = ops->ndo_set_vf_tx_rate(dev, ivt->vf,
+ ivt->rate);
+ if (err < 0)
+ return err;
+ }
+
+ if (tb[IFLA_VF_SPOOFCHK]) {
+ struct ifla_vf_spoofchk *ivs = nla_data(tb[IFLA_VF_SPOOFCHK]);
+
+ err = -EOPNOTSUPP;
+ if (ops->ndo_set_vf_spoofchk)
+ err = ops->ndo_set_vf_spoofchk(dev, ivs->vf,
+ ivs->setting);
+ if (err < 0)
+ return err;
+ }
+
return err;
}

@@ -1450,14 +1441,21 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
}

if (tb[IFLA_VFINFO_LIST]) {
+ struct nlattr *vfinfo[IFLA_VF_MAX + 1];
struct nlattr *attr;
int rem;
+
nla_for_each_nested(attr, tb[IFLA_VFINFO_LIST], rem) {
- if (nla_type(attr) != IFLA_VF_INFO) {
+ if (nla_type(attr) != IFLA_VF_INFO ||
+ nla_len(attr) < NLA_HDRLEN) {
err = -EINVAL;
goto errout;
}
- err = do_setvfinfo(dev, attr);
+ err = nla_parse_nested(vfinfo, IFLA_VF_MAX, attr,
+ ifla_vf_policy);
+ if (err < 0)
+ goto errout;
+ err = do_setvfinfo(dev, vfinfo);
if (err < 0)
goto errout;
modified = 1;
--
1.9.1

2016-03-16 08:41:48

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 016/107] s390/sclp: clear upper register halves in _sclp_print_early

From: Martin Schwidefsky <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f9c87a6f46d508eae0d9ae640be98d50f237f827 upstream.

If the kernel is compiled with gcc 5.1 and the XZ compression option
the decompress_kernel function calls _sclp_print_early in 64-bit mode
while the content of the upper register half of %r6 is non-zero.
This causes a specification exception on the servc instruction in
_sclp_servc.

The _sclp_print_early function saves and restores the upper registers
halves but it fails to clear them for the 31-bit code of the mini sclp
driver.

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
arch/s390/kernel/sclp.S | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kernel/sclp.S b/arch/s390/kernel/sclp.S
index 95792d8..51ca1c3 100644
--- a/arch/s390/kernel/sclp.S
+++ b/arch/s390/kernel/sclp.S
@@ -270,6 +270,8 @@ ENTRY(_sclp_print_early)
jno .Lesa2
ahi %r15,-80
stmh %r6,%r15,96(%r15) # store upper register halves
+ basr %r13,0
+ lmh %r0,%r15,.Lzeroes-.(%r13) # clear upper register halves
.Lesa2:
#endif
lr %r10,%r2 # save string pointer
@@ -293,6 +295,8 @@ ENTRY(_sclp_print_early)
#endif
lm %r6,%r15,120(%r15) # restore registers
br %r14
+.Lzeroes:
+ .fill 64,4,0

.LwritedataS4:
.long 0x00760005 # SCLP command for write data
--
1.9.1

2016-03-16 08:42:22

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 014/107] USB: cp210x: add ID for Aruba Networks controllers

From: Peter Sanford <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f98a7aa81eeeadcad25665c3501c236d531d4382 upstream.

Add the USB serial console device ID for Aruba Networks 7xxx series
controllers which have a USB port for their serial console.

Signed-off-by: Peter Sanford <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/serial/cp210x.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index 29bf383..7a04e2c 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -193,6 +193,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(0x1FB9, 0x0602) }, /* Lake Shore Model 648 Magnet Power Supply */
{ USB_DEVICE(0x1FB9, 0x0700) }, /* Lake Shore Model 737 VSM Controller */
{ USB_DEVICE(0x1FB9, 0x0701) }, /* Lake Shore Model 776 Hall Matrix */
+ { USB_DEVICE(0x2626, 0xEA60) }, /* Aruba Networks 7xxx USB Serial Console */
{ USB_DEVICE(0x3195, 0xF190) }, /* Link Instruments MSO-19 */
{ USB_DEVICE(0x3195, 0xF280) }, /* Link Instruments MSO-28 */
{ USB_DEVICE(0x3195, 0xF281) }, /* Link Instruments MSO-28 */
--
1.9.1

2016-03-16 08:43:29

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 011/107] dm btree remove: fix bug in redistribute3

From: Dennis Yang <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 4c7e309340ff85072e96f529582d159002c36734 upstream.

redistribute3() shares entries out across 3 nodes. Some entries were
being moved the wrong way, breaking the ordering. This manifested as a
BUG() in dm-btree-remove.c:shift() when entries were removed from the
btree.

For additional context see:
https://www.redhat.com/archives/dm-devel/2015-May/msg00113.html

Signed-off-by: Dennis Yang <[email protected]>
Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/md/persistent-data/dm-btree-remove.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/persistent-data/dm-btree-remove.c b/drivers/md/persistent-data/dm-btree-remove.c
index b88757c..a03178e 100644
--- a/drivers/md/persistent-data/dm-btree-remove.c
+++ b/drivers/md/persistent-data/dm-btree-remove.c
@@ -309,8 +309,8 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,

if (s < 0 && nr_center < -s) {
/* not enough in central node */
- shift(left, center, nr_center);
- s = nr_center - target;
+ shift(left, center, -nr_center);
+ s += nr_center;
shift(left, right, s);
nr_right += s;
} else
@@ -323,7 +323,7 @@ static void redistribute3(struct dm_btree_info *info, struct btree_node *parent,
if (s > 0 && nr_center < s) {
/* not enough in central node */
shift(center, right, nr_center);
- s = target - nr_center;
+ s -= nr_center;
shift(left, right, s);
nr_left -= s;
} else
--
1.9.1

2016-03-16 08:43:26

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 013/107] USB: option: add 2020:4000 ID

From: Claudio Cappelli <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit f6d7fb37f92622479ef6da604f27561f5045ba1e upstream.

Add device Olivetti Olicard 300 (Network Connect: MT6225) - IDs 2020:4000.

T: Bus=01 Lev=02 Prnt=04 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=2020 ProdID=4000 Rev=03.00
S: Manufacturer=Network Connect
S: Product=MT6225
C: #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=02 Prot=01 Driver=option
I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

Signed-off-by: Claudio Cappelli <[email protected]>
Suggested-by: Lars Melin <[email protected]>
[johan: amend commit message with devices info ]
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
drivers/usb/serial/option.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index d8232df..cb999af 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -1757,6 +1757,7 @@ static const struct usb_device_id option_ids[] = {
{ USB_DEVICE_AND_INTERFACE_INFO(0x2001, 0x7d03, 0xff, 0x00, 0x00) },
{ USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e01, 0xff, 0xff, 0xff) }, /* D-Link DWM-152/C1 */
{ USB_DEVICE_AND_INTERFACE_INFO(0x07d1, 0x3e02, 0xff, 0xff, 0xff) }, /* D-Link DWM-156/C1 */
+ { USB_DEVICE_INTERFACE_CLASS(0x2020, 0x4000, 0xff) }, /* OLICARD300 - MT6225 */
{ USB_DEVICE(INOVIA_VENDOR_ID, INOVIA_SEW858) },
{ USB_DEVICE(VIATELECOM_VENDOR_ID, VIATELECOM_PRODUCT_CDS7) },
{ } /* Terminating entry */
--
1.9.1

2016-03-16 08:44:42

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 010/107] ALSA: usb-audio: Add MIDI support for Steinberg MI2/MI4

From: Dominic Sacré <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0689a86ae814f39af94a9736a0a5426dd82eb107 upstream.

The Steinberg MI2 and MI4 interfaces are compatible with the USB class
audio spec, but the MIDI part of the devices is reported as a vendor
specific interface.

This patch adds entries to quirks-table.h to recognize the MIDI
endpoints. Audio functionality was already working and is unaffected by
this change.

Signed-off-by: Dominic Sacré <[email protected]>
Signed-off-by: Albert Huitsing <[email protected]>
Acked-by: Clemens Ladisch <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
sound/usb/quirks-table.h | 68 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)

diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
index 2ad5d77..4cebbf7 100644
--- a/sound/usb/quirks-table.h
+++ b/sound/usb/quirks-table.h
@@ -2461,6 +2461,74 @@ YAMAHA_DEVICE(0x7010, "UB99"),
}
},

+/* Steinberg devices */
+{
+ /* Steinberg MI2 */
+ USB_DEVICE_VENDOR_SPEC(0x0a4e, 0x2040),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ .ifnum = QUIRK_ANY_INTERFACE,
+ .type = QUIRK_COMPOSITE,
+ .data = & (const struct snd_usb_audio_quirk[]) {
+ {
+ .ifnum = 0,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 1,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 2,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 3,
+ .type = QUIRK_MIDI_FIXED_ENDPOINT,
+ .data = &(const struct snd_usb_midi_endpoint_info) {
+ .out_cables = 0x0001,
+ .in_cables = 0x0001
+ }
+ },
+ {
+ .ifnum = -1
+ }
+ }
+ }
+},
+{
+ /* Steinberg MI4 */
+ USB_DEVICE_VENDOR_SPEC(0x0a4e, 0x4040),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ .ifnum = QUIRK_ANY_INTERFACE,
+ .type = QUIRK_COMPOSITE,
+ .data = & (const struct snd_usb_audio_quirk[]) {
+ {
+ .ifnum = 0,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 1,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 2,
+ .type = QUIRK_AUDIO_STANDARD_INTERFACE
+ },
+ {
+ .ifnum = 3,
+ .type = QUIRK_MIDI_FIXED_ENDPOINT,
+ .data = &(const struct snd_usb_midi_endpoint_info) {
+ .out_cables = 0x0001,
+ .in_cables = 0x0001
+ }
+ },
+ {
+ .ifnum = -1
+ }
+ }
+ }
+},
+
/* TerraTec devices */
{
USB_DEVICE_VENDOR_SPEC(0x0ccd, 0x0012),
--
1.9.1

2016-03-16 08:45:05

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 009/107] 9p: don't leave a half-initialized inode sitting around

From: Al Viro <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 0a73d0a204a4a04a1e110539c5a524ae51f91d6d upstream.

Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Zefan Li <[email protected]>
---
fs/9p/vfs_inode.c | 3 +--
fs/9p/vfs_inode_dotl.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 014c8dd..c9b32dc 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -540,8 +540,7 @@ static struct inode *v9fs_qid_iget(struct super_block *sb,
unlock_new_inode(inode);
return inode;
error:
- unlock_new_inode(inode);
- iput(inode);
+ iget_failed(inode);
return ERR_PTR(retval);

}
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index a86a78d..5cfbadd 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -169,8 +169,7 @@ static struct inode *v9fs_qid_iget_dotl(struct super_block *sb,
unlock_new_inode(inode);
return inode;
error:
- unlock_new_inode(inode);
- iput(inode);
+ iget_failed(inode);
return ERR_PTR(retval);

}
--
1.9.1

2016-03-16 08:45:23

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 007/107] hpfs: kstrdup() out of memory handling

From: Sanidhya Kashyap <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit ce657611baf902f14ae559ce4e0787ead6712067 upstream.

There is a possibility of nothing being allocated to the new_opts in
case of memory pressure, therefore return ENOMEM for such case.

Signed-off-by: Sanidhya Kashyap <[email protected]>
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
fs/hpfs/super.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index 0bf578d..0b990b2 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -385,9 +385,13 @@ static int hpfs_remount_fs(struct super_block *s, int *flags, char *data)
int o;
struct hpfs_sb_info *sbi = hpfs_sb(s);
char *new_opts = kstrdup(data, GFP_KERNEL);
-
+
+
+ if (!new_opts)
+ return -ENOMEM;
+
*flags |= MS_NOATIME;
-
+
hpfs_lock(s);
lock_super(s);
uid = sbi->sb_uid; gid = sbi->sb_gid;
--
1.9.1

2016-03-16 08:45:57

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 004/107] ext4: avoid deadlocks in the writeback path by using sb_getblk_gfp

From: Nikolay Borisov <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit c45653c341f5c8a0ce19c8f0ad4678640849cb86 upstream.

Switch ext4 to using sb_getblk_gfp with GFP_NOFS added to fix possible
deadlocks in the page writeback path.

Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
fs/ext4/extents.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index bbe09a9..e05cb4c 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -699,7 +699,8 @@ ext4_ext_find_extent(struct inode *inode, ext4_lblk_t block,
path[ppos].p_depth = i;
path[ppos].p_ext = NULL;

- bh = sb_getblk(inode->i_sb, path[ppos].p_block);
+ bh = sb_getblk_gfp(inode->i_sb, path[ppos].p_block,
+ __GFP_MOVABLE | GFP_NOFS);
if (unlikely(!bh)) {
ret = -ENOMEM;
goto err;
@@ -904,7 +905,7 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
err = -EIO;
goto cleanup;
}
- bh = sb_getblk(inode->i_sb, newblock);
+ bh = sb_getblk_gfp(inode->i_sb, newblock, __GFP_MOVABLE | GFP_NOFS);
if (!bh) {
err = -ENOMEM;
goto cleanup;
@@ -1088,7 +1089,7 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
if (newblock == 0)
return err;

- bh = sb_getblk(inode->i_sb, newblock);
+ bh = sb_getblk_gfp(inode->i_sb, newblock, __GFP_MOVABLE | GFP_NOFS);
if (!bh)
return -ENOMEM;
lock_buffer(bh);
--
1.9.1

2016-03-16 08:46:19

by lizf

[permalink] [raw]
Subject: [PATCH 3.4 002/107] fs/buffer.c: support buffer cache allocations with gfp modifiers

From: Gioh Kim <[email protected]>

3.4.111-rc1 review patch. If anyone has any objections, please let me know.

------------------


commit 3b5e6454aaf6b4439b19400d8365e2ec2d24e411 upstream.

A buffer cache is allocated from movable area because it is referred
for a while and released soon. But some filesystems are taking buffer
cache for a long time and it can disturb page migration.

New APIs are introduced to allocate buffer cache with user specific
flag. *_gfp APIs are for user want to set page allocation flag for
page cache allocation. And *_unmovable APIs are for the user wants to
allocate page cache from non-movable area.

Signed-off-by: Gioh Kim <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <[email protected]>
---
fs/buffer.c | 43 ++++++++++++++++++++++++-----------------
include/linux/buffer_head.h | 47 ++++++++++++++++++++++++++++++++++++++++-----
2 files changed, 67 insertions(+), 23 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index ed2dc70..e65a585 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -952,7 +952,7 @@ init_page_buffers(struct page *page, struct block_device *bdev,
*/
static int
grow_dev_page(struct block_device *bdev, sector_t block,
- pgoff_t index, int size, int sizebits)
+ pgoff_t index, int size, int sizebits, gfp_t gfp)
{
struct inode *inode = bdev->bd_inode;
struct page *page;
@@ -961,7 +961,7 @@ grow_dev_page(struct block_device *bdev, sector_t block,
int ret = 0; /* Will call free_more_memory() */

page = find_or_create_page(inode->i_mapping, index,
- (mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS)|__GFP_MOVABLE);
+ (mapping_gfp_mask(inode->i_mapping) & ~__GFP_FS) | gfp);
if (!page)
return ret;

@@ -1009,7 +1009,7 @@ failed:
* that page was dirty, the buffers are set dirty also.
*/
static int
-grow_buffers(struct block_device *bdev, sector_t block, int size)
+grow_buffers(struct block_device *bdev, sector_t block, int size, gfp_t gfp)
{
pgoff_t index;
int sizebits;
@@ -1036,11 +1036,12 @@ grow_buffers(struct block_device *bdev, sector_t block, int size)
}

/* Create a page with the proper size buffers.. */
- return grow_dev_page(bdev, block, index, size, sizebits);
+ return grow_dev_page(bdev, block, index, size, sizebits, gfp);
}

-static struct buffer_head *
-__getblk_slow(struct block_device *bdev, sector_t block, int size)
+struct buffer_head *
+__getblk_slow(struct block_device *bdev, sector_t block,
+ unsigned size, gfp_t gfp)
{
/* Size must be multiple of hard sectorsize */
if (unlikely(size & (bdev_logical_block_size(bdev)-1) ||
@@ -1062,13 +1063,14 @@ __getblk_slow(struct block_device *bdev, sector_t block, int size)
if (bh)
return bh;

- ret = grow_buffers(bdev, block, size);
+ ret = grow_buffers(bdev, block, size, gfp);
if (ret < 0)
return NULL;
if (ret == 0)
free_more_memory();
}
}
+EXPORT_SYMBOL(__getblk_slow);

/*
* The relationship between dirty buffers and dirty pages:
@@ -1319,24 +1321,25 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
EXPORT_SYMBOL(__find_get_block);

/*
- * __getblk will locate (and, if necessary, create) the buffer_head
+ * __getblk_gfp() will locate (and, if necessary, create) the buffer_head
* which corresponds to the passed block_device, block and size. The
* returned buffer has its reference count incremented.
*
- * __getblk() will lock up the machine if grow_dev_page's try_to_free_buffers()
- * attempt is failing. FIXME, perhaps?
+ * __getblk_gfp() will lock up the machine if grow_dev_page's
+ * try_to_free_buffers() attempt is failing. FIXME, perhaps?
*/
struct buffer_head *
-__getblk(struct block_device *bdev, sector_t block, unsigned size)
+__getblk_gfp(struct block_device *bdev, sector_t block,
+ unsigned size, gfp_t gfp)
{
struct buffer_head *bh = __find_get_block(bdev, block, size);

might_sleep();
if (bh == NULL)
- bh = __getblk_slow(bdev, block, size);
+ bh = __getblk_slow(bdev, block, size, gfp);
return bh;
}
-EXPORT_SYMBOL(__getblk);
+EXPORT_SYMBOL(__getblk_gfp);

/*
* Do async read-ahead on a buffer..
@@ -1352,24 +1355,28 @@ void __breadahead(struct block_device *bdev, sector_t block, unsigned size)
EXPORT_SYMBOL(__breadahead);

/**
- * __bread() - reads a specified block and returns the bh
+ * __bread_gfp() - reads a specified block and returns the bh
* @bdev: the block_device to read from
* @block: number of block
* @size: size (in bytes) to read
- *
+ * @gfp: page allocation flag
+ *
* Reads a specified block, and returns buffer head that contains it.
+ * The page cache can be allocated from non-movable area
+ * not to prevent page migration if you set gfp to zero.
* It returns NULL if the block was unreadable.
*/
struct buffer_head *
-__bread(struct block_device *bdev, sector_t block, unsigned size)
+__bread_gfp(struct block_device *bdev, sector_t block,
+ unsigned size, gfp_t gfp)
{
- struct buffer_head *bh = __getblk(bdev, block, size);
+ struct buffer_head *bh = __getblk_gfp(bdev, block, size, gfp);

if (likely(bh) && !buffer_uptodate(bh))
bh = __bread_slow(bh);
return bh;
}
-EXPORT_SYMBOL(__bread);
+EXPORT_SYMBOL(__bread_gfp);

/*
* invalidate_bh_lrus() is called rarely - but not only at unmount.
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 458f497..1738420 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -166,12 +166,13 @@ void __wait_on_buffer(struct buffer_head *);
wait_queue_head_t *bh_waitq_head(struct buffer_head *bh);
struct buffer_head *__find_get_block(struct block_device *bdev, sector_t block,
unsigned size);
-struct buffer_head *__getblk(struct block_device *bdev, sector_t block,
- unsigned size);
+struct buffer_head *__getblk_gfp(struct block_device *bdev, sector_t block,
+ unsigned size, gfp_t gfp);
void __brelse(struct buffer_head *);
void __bforget(struct buffer_head *);
void __breadahead(struct block_device *, sector_t block, unsigned int size);
-struct buffer_head *__bread(struct block_device *, sector_t block, unsigned size);
+struct buffer_head *__bread_gfp(struct block_device *,
+ sector_t block, unsigned size, gfp_t gfp);
void invalidate_bh_lrus(void);
struct buffer_head *alloc_buffer_head(gfp_t gfp_flags);
void free_buffer_head(struct buffer_head * bh);
@@ -286,7 +287,13 @@ static inline void bforget(struct buffer_head *bh)
static inline struct buffer_head *
sb_bread(struct super_block *sb, sector_t block)
{
- return __bread(sb->s_bdev, block, sb->s_blocksize);
+ return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
+}
+
+static inline struct buffer_head *
+sb_bread_unmovable(struct super_block *sb, sector_t block)
+{
+ return __bread_gfp(sb->s_bdev, block, sb->s_blocksize, 0);
}

static inline void
@@ -298,7 +305,7 @@ sb_breadahead(struct super_block *sb, sector_t block)
static inline struct buffer_head *
sb_getblk(struct super_block *sb, sector_t block)
{
- return __getblk(sb->s_bdev, block, sb->s_blocksize);
+ return __getblk_gfp(sb->s_bdev, block, sb->s_blocksize, __GFP_MOVABLE);
}

static inline struct buffer_head *
@@ -335,6 +342,36 @@ static inline void lock_buffer(struct buffer_head *bh)
__lock_buffer(bh);
}

+static inline struct buffer_head *getblk_unmovable(struct block_device *bdev,
+ sector_t block,
+ unsigned size)
+{
+ return __getblk_gfp(bdev, block, size, 0);
+}
+
+static inline struct buffer_head *__getblk(struct block_device *bdev,
+ sector_t block,
+ unsigned size)
+{
+ return __getblk_gfp(bdev, block, size, __GFP_MOVABLE);
+}
+
+/**
+ * __bread() - reads a specified block and returns the bh
+ * @bdev: the block_device to read from
+ * @block: number of block
+ * @size: size (in bytes) to read
+ *
+ * Reads a specified block, and returns buffer head that contains it.
+ * The page cache is allocated from movable area so that it can be migrated.
+ * It returns NULL if the block was unreadable.
+ */
+static inline struct buffer_head *
+__bread(struct block_device *bdev, sector_t block, unsigned size)
+{
+ return __bread_gfp(bdev, block, size, __GFP_MOVABLE);
+}
+
extern int __set_page_dirty_buffers(struct page *page);

#else /* CONFIG_BLOCK */
--
1.9.1

2016-03-16 10:47:35

by Richard Stearn

[permalink] [raw]
Subject: Re: [PATCH 3.4 030/107] NET: AX.25: Stop heartbeat timer on disconnect.

[email protected] wrote:
> From: Richard Stearn <[email protected]>
>
> 3.4.111-rc1 review patch. If anyone has any objections, please let me know.

Hi

This patch should _not_ be applied.
Breaks other parts of the driver.
Investigations on-going.

Apologies for the disruption.
--
Regards
Richard

2016-03-16 14:09:53

by Don Zickus

[permalink] [raw]
Subject: Re: [PATCH 3.4 098/107] kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one

On Wed, Mar 16, 2016 at 04:06:32PM +0800, [email protected] wrote:
> From: Ben Zhang <[email protected]>
>
> 3.4.111-rc1 review patch. If anyone has any objections, please let me know.

Just an FYI below, this patch won't work the way it was integrated..

comments below

>
> ------------------
>
>
> commit 62572e29bc530b38921ef6059088b4788a9832a5 upstream.
>
> I ran into a scenario where while one cpu was stuck and should have
> panic'd because of the NMI watchdog, it didn't. The reason was another
> cpu was spewing stack dumps on to the console. Upon investigation, I
> noticed that when writing to the console and also when dumping the
> stack, the watchdog is touched.
>
> This causes all the cpus to reset their NMI watchdog flags and the
> 'stuck' cpu just spins forever.
>
> This change causes the semantics of touch_nmi_watchdog to be changed
> slightly. Previously, I accidentally changed the semantics and we
> noticed there was a codepath in which touch_nmi_watchdog could be
> touched from a preemtible area. That caused a BUG() to happen when
> CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code.
>
> My attempt here re-introduces the change to have the
> touch_nmi_watchdog() code only touch the local cpu instead of all of the
> cpus. But instead of using __get_cpu_var(), I use the
> __raw_get_cpu_var() version.
>
> This avoids the preemption problem. However my reasoning wasn't because
> I was trying to be lazy. Instead I rationalized it as, well if
> preemption is enabled then interrupts should be enabled to and the NMI
> watchdog will have no reason to trigger. So it won't matter if the
> wrong cpu is touched because the percpu interrupt counters the NMI
> watchdog uses should still be incrementing.
>
> Don said:
>
> : I'm ok with this patch, though it does alter the behaviour of how
> : touch_nmi_watchdog works. For the most part I don't think most callers
> : need to touch all of the watchdogs (on each cpu). Perhaps a corner case
> : will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ).
> :
> : But this does address an issue where if a system is locked up and one cpu
> : is spewing out useful debug messages (or error messages), the hard lockup
> : will fail to go off. We have seen this on RHEL also.
>
> Signed-off-by: Don Zickus <[email protected]>
> Signed-off-by: Ben Zhang <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
> [lizf: Backported to 3.4: adjust context]
> Signed-off-by: Zefan Li <[email protected]>
> ---
> kernel/watchdog.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 991aa93..7527c8c 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -162,6 +162,14 @@ void touch_nmi_watchdog(void)
> per_cpu(watchdog_nmi_touch, cpu) = true;
> }
> }

The above for-loop was to be replaced by the non-for-loop below.

The above for-loop is the problem this patch was solving, so keeping it
around does not solve anything. :-)


> + /*
> + * Using __raw here because some code paths have
> + * preemption enabled. If preemption is enabled
> + * then interrupts should be enabled too, in which
> + * case we shouldn't have to worry about the watchdog
> + * going off.
> + */
> + __raw_get_cpu_var(watchdog_nmi_touch) = true;
> touch_softlockup_watchdog();
> }
> EXPORT_SYMBOL(touch_nmi_watchdog);

Cheers,
Don

2016-03-16 17:51:40

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.4 000/107] 3.4.111-rc1 review

On Wed, Mar 16, 2016 at 04:05:41PM +0800, [email protected] wrote:
> From: Zefan Li <[email protected]>
>
> This is the start of the stable review cycle for the 3.4.111 release.
> There are 107 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Mar 16 16:04:10 CST 2016.
> Anything received after that time might be too late.
>
Build results:
total: 98 pass: 98 fail: 0
Qemu test results:
total: 63 pass: 63 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

2016-03-17 01:19:25

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH 3.4 030/107] NET: AX.25: Stop heartbeat timer on disconnect.

On 2016/3/16 18:40, Richard Stearn wrote:
> [email protected] wrote:
>> From: Richard Stearn <[email protected]>
>>
>> 3.4.111-rc1 review patch. If anyone has any objections, please let me know.
>
> Hi
>
> This patch should _not_ be applied.
> Breaks other parts of the driver.
> Investigations on-going.
>
> Apologies for the disruption.

Thanks! I'll drop it.

2016-03-17 01:21:02

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH 3.4 098/107] kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one

>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 991aa93..7527c8c 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -162,6 +162,14 @@ void touch_nmi_watchdog(void)
>> per_cpu(watchdog_nmi_touch, cpu) = true;
>> }
>> }
>
> The above for-loop was to be replaced by the non-for-loop below.
>
> The above for-loop is the problem this patch was solving, so keeping it
> around does not solve anything. :-)
>

OOps, my fault. I'll remove this for-loop. Thanks for your review!

>
>> + /*
>> + * Using __raw here because some code paths have
>> + * preemption enabled. If preemption is enabled
>> + * then interrupts should be enabled too, in which
>> + * case we shouldn't have to worry about the watchdog
>> + * going off.
>> + */
>> + __raw_get_cpu_var(watchdog_nmi_touch) = true;
>> touch_softlockup_watchdog();
>> }
>> EXPORT_SYMBOL(touch_nmi_watchdog);
>
> Cheers,
> Don
> .
>

2016-03-17 01:21:59

by Zefan Li

[permalink] [raw]
Subject: Re: [PATCH 3.4 000/107] 3.4.111-rc1 review

On 2016/3/17 1:51, Guenter Roeck wrote:
> On Wed, Mar 16, 2016 at 04:05:41PM +0800, [email protected] wrote:
>> From: Zefan Li <[email protected]>
>>
>> This is the start of the stable review cycle for the 3.4.111 release.
>> There are 107 patches in this series, all will be posted as a response
>> to this one. If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Wed Mar 16 16:04:10 CST 2016.
>> Anything received after that time might be too late.
>>
> Build results:
> total: 98 pass: 98 fail: 0
> Qemu test results:
> total: 63 pass: 63 fail: 0
>
> Details are available at http://kerneltests.org/builders.
>

Thanks for testing!