<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch, branch v4.9.69</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>sparc64/mm: set fields in deferred pages</title>
<updated>2017-12-14T08:28:23+00:00</updated>
<author>
<name>Pavel Tatashin</name>
<email>pasha.tatashin@oracle.com</email>
</author>
<published>2017-11-16T01:36:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7da67d1d98f609b0f24fb6c1a317523c965898b7'/>
<id>7da67d1d98f609b0f24fb6c1a317523c965898b7</id>
<content type='text'>
[ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ]

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:

mem_init() {
     register_page_bootmem_info();
     free_all_bootmem();
     ...
}

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page-&gt;freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() &lt;- Only if this is deferred reserved page
    __init_single_pfn()
     __init_single_page()
      memset(0) &lt;-- Loose the set fields here

We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed.  But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Reviewed-by: Steven Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Reviewed-by: Bob Picco &lt;bob.picco@oracle.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ]

Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
flags and other fields in "struct page"es are never changed prior to
first initializing struct pages by going through __init_single_page().

With deferred struct page feature enabled there is a case where we set
some fields prior to initializing:

mem_init() {
     register_page_bootmem_info();
     free_all_bootmem();
     ...
}

When register_page_bootmem_info() is called only non-deferred struct
pages are initialized.  But, this function goes through some reserved
pages which might be part of the deferred, and thus are not yet
initialized.

mem_init
register_page_bootmem_info
register_page_bootmem_info_node
 get_page_bootmem
  .. setting fields here ..
  such as: page-&gt;freelist = (void *)type;

free_all_bootmem()
free_low_memory_core_early()
 for_each_reserved_mem_region()
  reserve_bootmem_region()
   init_reserved_page() &lt;- Only if this is deferred reserved page
    __init_single_pfn()
     __init_single_page()
      memset(0) &lt;-- Loose the set fields here

We end up with similar issue as in the previous patch, where currently
we do not observe problem as memory is zeroed.  But, if flag asserts are
changed we can start hitting issues.

Also, because in this patch series we will stop zeroing struct page
memory during allocation, we must make sure that struct pages are
properly initialized prior to using them.

The deferred-reserved pages are initialized in free_all_bootmem().
Therefore, the fix is to switch the above calls.

Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Reviewed-by: Steven Sistare &lt;steven.sistare@oracle.com&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Reviewed-by: Bob Picco &lt;bob.picco@oracle.com&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Sam Ravnborg &lt;sam@ravnborg.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested</title>
<updated>2017-12-14T08:28:21+00:00</updated>
<author>
<name>Alexey Kardashevskiy</name>
<email>aik@ozlabs.ru</email>
</author>
<published>2017-02-22T04:43:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d28faebe8b091bec9bdb287b4ab029bee3a7d716'/>
<id>d28faebe8b091bec9bdb287b4ab029bee3a7d716</id>
<content type='text'>
[ Upstream commit 7aafac11e308d37ed3c509829bb43d80c1811ac3 ]

The IODA2 specification says that a 64 DMA address cannot use top 4 bits
(3 are reserved and one is a "TVE select"); bottom page_shift bits
cannot be used for multilevel table addressing either.

The existing IODA2 table allocation code aligns the minimum TCE table
size to PAGE_SIZE so in the case of 64K system pages and 4K IOMMU pages,
we have 64-4-12=48 bits. Since 64K page stores 8192 TCEs, i.e. needs
13 bits, the maximum number of levels is 48/13 = 3 so we physically
cannot address more and EEH happens on DMA accesses.

This adds a check that too many levels were requested.

It is still possible to have 5 levels in the case of 4K system page size.

Signed-off-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Acked-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7aafac11e308d37ed3c509829bb43d80c1811ac3 ]

The IODA2 specification says that a 64 DMA address cannot use top 4 bits
(3 are reserved and one is a "TVE select"); bottom page_shift bits
cannot be used for multilevel table addressing either.

The existing IODA2 table allocation code aligns the minimum TCE table
size to PAGE_SIZE so in the case of 64K system pages and 4K IOMMU pages,
we have 64-4-12=48 bits. Since 64K page stores 8192 TCEs, i.e. needs
13 bits, the maximum number of levels is 48/13 = 3 so we physically
cannot address more and EEH happens on DMA accesses.

This adds a check that too many levels were requested.

It is still possible to have 5 levels in the case of 4K system page size.

Signed-off-by: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;
Acked-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>axonram: Fix gendisk handling</title>
<updated>2017-12-14T08:28:21+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-03-08T13:56:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e8d5b115c6ea552d720315b4d6413ea6650d9966'/>
<id>e8d5b115c6ea552d720315b4d6413ea6650d9966</id>
<content type='text'>
[ Upstream commit 672a2c87c83649fb0167202342ce85af9a3b4f1c ]

It is invalid to call del_gendisk() when disk-&gt;queue is NULL. Fix error
handling in axon_ram_probe() to avoid doing that.

Also del_gendisk() does not drop a reference to gendisk allocated by
alloc_disk(). That has to be done by put_disk(). Add that call where
needed.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 672a2c87c83649fb0167202342ce85af9a3b4f1c ]

It is invalid to call del_gendisk() when disk-&gt;queue is NULL. Fix error
handling in axon_ram_probe() to avoid doing that.

Also del_gendisk() does not drop a reference to gendisk allocated by
alloc_disk(). That has to be done by put_disk(). Add that call where
needed.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: KVM: Survive unknown traps from guests</title>
<updated>2017-12-14T08:28:19+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2017-02-20T12:30:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=aafb72d2ed5579208191cfc3fb9425570c1303ce'/>
<id>aafb72d2ed5579208191cfc3fb9425570c1303ce</id>
<content type='text'>
[ Upstream commit ba4dd156eabdca93501d92a980ba27fa5f4bbd27 ]

Currently we BUG() if we see an ESR_EL2.EC value we don't recognise. As
configurable disables/enables are added to the architecture (controlled
by RES1/RES0 bits respectively), with associated synchronous exceptions,
it may be possible for a guest to trigger exceptions with classes that
we don't recognise.

While we can't service these exceptions in a manner useful to the guest,
we can avoid bringing down the host. Per ARM DDI 0487A.k_iss10775, page
D7-1937, EC values within the range 0x00 - 0x2c are reserved for future
use with synchronous exceptions, and EC values within the range 0x2d -
0x3f may be used for either synchronous or asynchronous exceptions.

The patch makes KVM handle any unknown EC by injecting an UNDEFINED
exception into the guest, with a corresponding (ratelimited) warning in
the host dmesg. We could later improve on this with with a new (opt-in)
exit to the host userspace.

Cc: Dave Martin &lt;dave.martin@arm.com&gt;
Cc: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit ba4dd156eabdca93501d92a980ba27fa5f4bbd27 ]

Currently we BUG() if we see an ESR_EL2.EC value we don't recognise. As
configurable disables/enables are added to the architecture (controlled
by RES1/RES0 bits respectively), with associated synchronous exceptions,
it may be possible for a guest to trigger exceptions with classes that
we don't recognise.

While we can't service these exceptions in a manner useful to the guest,
we can avoid bringing down the host. Per ARM DDI 0487A.k_iss10775, page
D7-1937, EC values within the range 0x00 - 0x2c are reserved for future
use with synchronous exceptions, and EC values within the range 0x2d -
0x3f may be used for either synchronous or asynchronous exceptions.

The patch makes KVM handle any unknown EC by injecting an UNDEFINED
exception into the guest, with a corresponding (ratelimited) warning in
the host dmesg. We could later improve on this with with a new (opt-in)
exit to the host userspace.

Cc: Dave Martin &lt;dave.martin@arm.com&gt;
Cc: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm: KVM: Survive unknown traps from guests</title>
<updated>2017-12-14T08:28:19+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2017-02-20T12:30:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=5e366aaec185c83157a126dad227a75e59899c62'/>
<id>5e366aaec185c83157a126dad227a75e59899c62</id>
<content type='text'>
[ Upstream commit f050fe7a9164945dd1c28be05bf00e8cfb082ccf ]

Currently we BUG() if we see a HSR.EC value we don't recognise. As
configurable disables/enables are added to the architecture (controlled
by RES1/RES0 bits respectively), with associated synchronous exceptions,
it may be possible for a guest to trigger exceptions with classes that
we don't recognise.

While we can't service these exceptions in a manner useful to the guest,
we can avoid bringing down the host. Per ARM DDI 0406C.c, all currently
unallocated HSR EC encodings are reserved, and per ARM DDI
0487A.k_iss10775, page G6-4395, EC values within the range 0x00 - 0x2c
are reserved for future use with synchronous exceptions, and EC values
within the range 0x2d - 0x3f may be used for either synchronous or
asynchronous exceptions.

The patch makes KVM handle any unknown EC by injecting an UNDEFINED
exception into the guest, with a corresponding (ratelimited) warning in
the host dmesg. We could later improve on this with with a new (opt-in)
exit to the host userspace.

Cc: Dave Martin &lt;dave.martin@arm.com&gt;
Cc: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f050fe7a9164945dd1c28be05bf00e8cfb082ccf ]

Currently we BUG() if we see a HSR.EC value we don't recognise. As
configurable disables/enables are added to the architecture (controlled
by RES1/RES0 bits respectively), with associated synchronous exceptions,
it may be possible for a guest to trigger exceptions with classes that
we don't recognise.

While we can't service these exceptions in a manner useful to the guest,
we can avoid bringing down the host. Per ARM DDI 0406C.c, all currently
unallocated HSR EC encodings are reserved, and per ARM DDI
0487A.k_iss10775, page G6-4395, EC values within the range 0x00 - 0x2c
are reserved for future use with synchronous exceptions, and EC values
within the range 0x2d - 0x3f may be used for either synchronous or
asynchronous exceptions.

The patch makes KVM handle any unknown EC by injecting an UNDEFINED
exception into the guest, with a corresponding (ratelimited) warning in
the host dmesg. We could later improve on this with with a new (opt-in)
exit to the host userspace.

Cc: Dave Martin &lt;dave.martin@arm.com&gt;
Cc: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset</title>
<updated>2017-12-14T08:28:19+00:00</updated>
<author>
<name>Wanpeng Li</name>
<email>wanpeng.li@hotmail.com</email>
</author>
<published>2017-03-06T12:03:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c886f281b9e016f695df0a4d4aa91a2ead93869f'/>
<id>c886f281b9e016f695df0a4d4aa91a2ead93869f</id>
<content type='text'>
[ Upstream commit 2f707d97982286b307ef2a9b034e19aabc1abb56 ]

Reported by syzkaller:

    WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029
    nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
    CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:15 [inline]
     dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
     panic+0x1fb/0x412 kernel/panic.c:179
     __warn+0x1c4/0x1e0 kernel/panic.c:540
     warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
     nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
     vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline]
     vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324
     kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099
     do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128
     __msr_io arch/x86/kvm/x86.c:2577 [inline]
     msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614
     kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497
     kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721
     vfs_ioctl fs/ioctl.c:43 [inline]
     do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
     SYSC_ioctl fs/ioctl.c:698 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
     entry_SYSCALL_64_fastpath+0x1f/0xc2

The syzkaller folks reported a nested_run_pending warning during userspace
clear VMX capability which is exposed to L1 before.

The warning gets thrown while doing

(*(uint32_t*)0x20aecfe8 = (uint32_t)0x1);
(*(uint32_t*)0x20aecfec = (uint32_t)0x0);
(*(uint32_t*)0x20aecff0 = (uint32_t)0x3a);
(*(uint32_t*)0x20aecff4 = (uint32_t)0x0);
(*(uint64_t*)0x20aecff8 = (uint64_t)0x0);
r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul,
		0x20aecfe8ul, 0, 0, 0, 0, 0, 0);

i.e. KVM_SET_MSR ioctl with

struct kvm_msrs {
	.nmsrs = 1,
		.pad = 0,
		.entries = {
			{.index = MSR_IA32_FEATURE_CONTROL,
			 .reserved = 0,
			 .data = 0}
		}
}

The VMLANCH/VMRESUME emulation should be stopped since the CPU is going to
reset here. This patch resets the nested_run_pending since the CPU is going
to be reset hence there should be nothing pending.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Suggested-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Jim Mattson &lt;jmattson@google.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2f707d97982286b307ef2a9b034e19aabc1abb56 ]

Reported by syzkaller:

    WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029
    nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
    CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:15 [inline]
     dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
     panic+0x1fb/0x412 kernel/panic.c:179
     __warn+0x1c4/0x1e0 kernel/panic.c:540
     warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
     nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029
     vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline]
     vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324
     kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099
     do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128
     __msr_io arch/x86/kvm/x86.c:2577 [inline]
     msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614
     kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497
     kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721
     vfs_ioctl fs/ioctl.c:43 [inline]
     do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
     SYSC_ioctl fs/ioctl.c:698 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
     entry_SYSCALL_64_fastpath+0x1f/0xc2

The syzkaller folks reported a nested_run_pending warning during userspace
clear VMX capability which is exposed to L1 before.

The warning gets thrown while doing

(*(uint32_t*)0x20aecfe8 = (uint32_t)0x1);
(*(uint32_t*)0x20aecfec = (uint32_t)0x0);
(*(uint32_t*)0x20aecff0 = (uint32_t)0x3a);
(*(uint32_t*)0x20aecff4 = (uint32_t)0x0);
(*(uint64_t*)0x20aecff8 = (uint64_t)0x0);
r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul,
		0x20aecfe8ul, 0, 0, 0, 0, 0, 0);

i.e. KVM_SET_MSR ioctl with

struct kvm_msrs {
	.nmsrs = 1,
		.pad = 0,
		.entries = {
			{.index = MSR_IA32_FEATURE_CONTROL,
			 .reserved = 0,
			 .data = 0}
		}
}

The VMLANCH/VMRESUME emulation should be stopped since the CPU is going to
reset here. This patch resets the nested_run_pending since the CPU is going
to be reset hence there should be nothing pending.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Suggested-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Jim Mattson &lt;jmattson@google.com&gt;
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kvm: nVMX: VMCLEAR should not cause the vCPU to shut down</title>
<updated>2017-12-14T08:28:19+00:00</updated>
<author>
<name>Jim Mattson</name>
<email>jmattson@google.com</email>
</author>
<published>2017-03-02T20:41:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=29deec403d5614e67443cf9bf6ca3260c4f97d8a'/>
<id>29deec403d5614e67443cf9bf6ca3260c4f97d8a</id>
<content type='text'>
[ Upstream commit 587d7e72aedca91cee80c0a56811649c3efab765 ]

VMCLEAR should silently ignore a failure to clear the launch state of
the VMCS referenced by the operand.

Signed-off-by: Jim Mattson &lt;jmattson@google.com&gt;
[Changed "kvm_write_guest(vcpu-&gt;kvm" to "kvm_vcpu_write_guest(vcpu".]
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;

Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 587d7e72aedca91cee80c0a56811649c3efab765 ]

VMCLEAR should silently ignore a failure to clear the launch state of
the VMCS referenced by the operand.

Signed-off-by: Jim Mattson &lt;jmattson@google.com&gt;
[Changed "kvm_write_guest(vcpu-&gt;kvm" to "kvm_vcpu_write_guest(vcpu".]
Signed-off-by: Radim Krčmář &lt;rkrcmar@redhat.com&gt;

Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: OMAP2+: Release device node after it is no longer needed.</title>
<updated>2017-12-14T08:28:18+00:00</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2017-03-04T15:02:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=202c73946455e29db412cd594c0d7e47efcb28fa'/>
<id>202c73946455e29db412cd594c0d7e47efcb28fa</id>
<content type='text'>
[ Upstream commit b92675d998a9fa37fe9e0e35053a95b4a23c158b ]

The device node returned by of_find_node_by_name() needs to be released
after it is no longer needed to avoid a device node leak.

Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b92675d998a9fa37fe9e0e35053a95b4a23c158b ]

The device node returned by of_find_node_by_name() needs to be released
after it is no longer needed to avoid a device node leak.

Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: OMAP2+: Fix device node reference counts</title>
<updated>2017-12-14T08:28:18+00:00</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2017-03-04T15:02:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d3954c5ced8eb8c8548312f49c9bb28717812057'/>
<id>d3954c5ced8eb8c8548312f49c9bb28717812057</id>
<content type='text'>
[ Upstream commit 10e5778f54765c96fe0c8f104b7a030e5b35bc72 ]

After commit 0549bde0fcb1 ("of: fix of_node leak caused in
of_find_node_opts_by_path"), the following error may be
reported when running omap images.

OF: ERROR: Bad of_node_put() on /ocp@68000000
CPU: 0 PID: 0 Comm: swapper Not tainted 4.10.0-rc7-next-20170210 #1
Hardware name: Generic OMAP3-GP (Flattened Device Tree)
[&lt;c0310604&gt;] (unwind_backtrace) from [&lt;c030bbf4&gt;] (show_stack+0x10/0x14)
[&lt;c030bbf4&gt;] (show_stack) from [&lt;c05add8c&gt;] (dump_stack+0x98/0xac)
[&lt;c05add8c&gt;] (dump_stack) from [&lt;c05af1b0&gt;] (kobject_release+0x48/0x7c)
[&lt;c05af1b0&gt;] (kobject_release)
	from [&lt;c0ad1aa4&gt;] (of_find_node_by_name+0x74/0x94)
[&lt;c0ad1aa4&gt;] (of_find_node_by_name)
	from [&lt;c1215bd4&gt;] (omap3xxx_hwmod_is_hs_ip_block_usable+0x24/0x2c)
[&lt;c1215bd4&gt;] (omap3xxx_hwmod_is_hs_ip_block_usable) from
[&lt;c1215d5c&gt;] (omap3xxx_hwmod_init+0x180/0x274)
[&lt;c1215d5c&gt;] (omap3xxx_hwmod_init)
	from [&lt;c120faa8&gt;] (omap3_init_early+0xa0/0x11c)
[&lt;c120faa8&gt;] (omap3_init_early)
	from [&lt;c120fb2c&gt;] (omap3430_init_early+0x8/0x30)
[&lt;c120fb2c&gt;] (omap3430_init_early)
	from [&lt;c1204710&gt;] (setup_arch+0xc04/0xc34)
[&lt;c1204710&gt;] (setup_arch) from [&lt;c1200948&gt;] (start_kernel+0x68/0x38c)
[&lt;c1200948&gt;] (start_kernel) from [&lt;8020807c&gt;] (0x8020807c)

of_find_node_by_name() drops the reference to the passed device node.
The commit referenced above exposes this problem.

To fix the problem, use of_get_child_by_name() instead of
of_find_node_by_name(); of_get_child_by_name() does not drop
the reference count of passed device nodes. While semantically
different, we only look for immediate children of the passed
device node, so of_get_child_by_name() is a more appropriate
function to use anyway.

Release the reference to the device node obtained with
of_get_child_by_name() after it is no longer needed to avoid
another device node leak.

While at it, clean up the code and change the return type of
omap3xxx_hwmod_is_hs_ip_block_usable() to bool to match its use
and the return type of of_device_is_available().

Cc: Qi Hou &lt;qi.hou@windriver.com&gt;
Cc: Peter Rosin &lt;peda@axentia.se&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 10e5778f54765c96fe0c8f104b7a030e5b35bc72 ]

After commit 0549bde0fcb1 ("of: fix of_node leak caused in
of_find_node_opts_by_path"), the following error may be
reported when running omap images.

OF: ERROR: Bad of_node_put() on /ocp@68000000
CPU: 0 PID: 0 Comm: swapper Not tainted 4.10.0-rc7-next-20170210 #1
Hardware name: Generic OMAP3-GP (Flattened Device Tree)
[&lt;c0310604&gt;] (unwind_backtrace) from [&lt;c030bbf4&gt;] (show_stack+0x10/0x14)
[&lt;c030bbf4&gt;] (show_stack) from [&lt;c05add8c&gt;] (dump_stack+0x98/0xac)
[&lt;c05add8c&gt;] (dump_stack) from [&lt;c05af1b0&gt;] (kobject_release+0x48/0x7c)
[&lt;c05af1b0&gt;] (kobject_release)
	from [&lt;c0ad1aa4&gt;] (of_find_node_by_name+0x74/0x94)
[&lt;c0ad1aa4&gt;] (of_find_node_by_name)
	from [&lt;c1215bd4&gt;] (omap3xxx_hwmod_is_hs_ip_block_usable+0x24/0x2c)
[&lt;c1215bd4&gt;] (omap3xxx_hwmod_is_hs_ip_block_usable) from
[&lt;c1215d5c&gt;] (omap3xxx_hwmod_init+0x180/0x274)
[&lt;c1215d5c&gt;] (omap3xxx_hwmod_init)
	from [&lt;c120faa8&gt;] (omap3_init_early+0xa0/0x11c)
[&lt;c120faa8&gt;] (omap3_init_early)
	from [&lt;c120fb2c&gt;] (omap3430_init_early+0x8/0x30)
[&lt;c120fb2c&gt;] (omap3430_init_early)
	from [&lt;c1204710&gt;] (setup_arch+0xc04/0xc34)
[&lt;c1204710&gt;] (setup_arch) from [&lt;c1200948&gt;] (start_kernel+0x68/0x38c)
[&lt;c1200948&gt;] (start_kernel) from [&lt;8020807c&gt;] (0x8020807c)

of_find_node_by_name() drops the reference to the passed device node.
The commit referenced above exposes this problem.

To fix the problem, use of_get_child_by_name() instead of
of_find_node_by_name(); of_get_child_by_name() does not drop
the reference count of passed device nodes. While semantically
different, we only look for immediate children of the passed
device node, so of_get_child_by_name() is a more appropriate
function to use anyway.

Release the reference to the device node obtained with
of_get_child_by_name() after it is no longer needed to avoid
another device node leak.

While at it, clean up the code and change the return type of
omap3xxx_hwmod_is_hs_ip_block_usable() to bool to match its use
and the return type of of_device_is_available().

Cc: Qi Hou &lt;qi.hou@windriver.com&gt;
Cc: Peter Rosin &lt;peda@axentia.se&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: Fix checksum folding in csum_add()</title>
<updated>2017-12-14T08:28:17+00:00</updated>
<author>
<name>Shile Zhang</name>
<email>shile.zhang@nokia.com</email>
</author>
<published>2017-02-04T09:03:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=4aff028047156e58b08b06346a19063c723c565a'/>
<id>4aff028047156e58b08b06346a19063c723c565a</id>
<content type='text'>
[ Upstream commit 6ad966d7303b70165228dba1ee8da1a05c10eefe ]

Paul's patch to fix checksum folding, commit b492f7e4e07a ("powerpc/64:
Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold")
missed a case in csum_add(). Fix it.

Signed-off-by: Shile Zhang &lt;shile.zhang@nokia.com&gt;
Acked-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6ad966d7303b70165228dba1ee8da1a05c10eefe ]

Paul's patch to fix checksum folding, commit b492f7e4e07a ("powerpc/64:
Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold")
missed a case in csum_add(). Fix it.

Signed-off-by: Shile Zhang &lt;shile.zhang@nokia.com&gt;
Acked-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
