<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/mm/internal.h, branch v3.18.72</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>mm: use 'unsigned int' for page order</title>
<updated>2016-04-18T12:49:40+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-11-07T00:29:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=972c6ab7bf6590ccfd3904528de6cc72fefdecc9'/>
<id>972c6ab7bf6590ccfd3904528de6cc72fefdecc9</id>
<content type='text'>
[ Upstream commit d00181b96eb86c914cb327d1de974a1b71366e1b ]

Let's try to be consistent about data type of page order.

[sfr@canb.auug.org.au: fix build (type of pageblock_order)]
[hughd@google.com: some configs end up with MAX_ORDER and pageblock_order having different types]
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d00181b96eb86c914cb327d1de974a1b71366e1b ]

Let's try to be consistent about data type of page order.

[sfr@canb.auug.org.au: fix build (type of pageblock_order)]
[hughd@google.com: some configs end up with MAX_ORDER and pageblock_order having different types]
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: page_alloc: pass PFN to __free_pages_bootmem</title>
<updated>2016-04-18T12:49:40+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-06-30T21:56:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=302ea3eccb3317a4f4a04e00b6d7a1e811f87fa0'/>
<id>302ea3eccb3317a4f4a04e00b6d7a1e811f87fa0</id>
<content type='text'>
[ Upstream commit d70ddd7a5d9aa335f9b4b0c3d879e1e70ee1e4e3 ]

__free_pages_bootmem prepares a page for release to the buddy allocator
and assumes that the struct page is initialised.  Parallel initialisation
of struct pages defers initialisation and __free_pages_bootmem can be
called for struct pages that cannot yet map struct page to PFN.  This
patch passes PFN to __free_pages_bootmem with no other functional change.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Tested-by: Waiman Long &lt;waiman.long@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Acked-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Robin Holt &lt;robinmholt@gmail.com&gt;
Cc: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Waiman Long &lt;waiman.long@hp.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d70ddd7a5d9aa335f9b4b0c3d879e1e70ee1e4e3 ]

__free_pages_bootmem prepares a page for release to the buddy allocator
and assumes that the struct page is initialised.  Parallel initialisation
of struct pages defers initialisation and __free_pages_bootmem can be
called for struct pages that cannot yet map struct page to PFN.  This
patch passes PFN to __free_pages_bootmem with no other functional change.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Tested-by: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Tested-by: Waiman Long &lt;waiman.long@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Acked-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Robin Holt &lt;robinmholt@gmail.com&gt;
Cc: Nate Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Waiman Long &lt;waiman.long@hp.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_alloc: restrict max order of merging on isolated pageblock</title>
<updated>2014-11-14T00:17:05+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2014-11-13T23:19:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=3c605096d3158216ba9326a16266f6ba128c2c8d'/>
<id>3c605096d3158216ba9326a16266f6ba128c2c8d</id>
<content type='text'>
Current pageblock isolation logic could isolate each pageblock
individually.  This causes freepage accounting problem if freepage with
pageblock order on isolate pageblock is merged with other freepage on
normal pageblock.  We can prevent merging by restricting max order of
merging to pageblock order if freepage is on isolate pageblock.

A side-effect of this change is that there could be non-merged buddy
freepage even if finishing pageblock isolation, because undoing
pageblock isolation is just to move freepage from isolate buddy list to
normal buddy list rather than to consider merging.  So, the patch also
makes undoing pageblock isolation consider freepage merge.  When
un-isolation, freepage with more than pageblock order and it's buddy are
checked.  If they are on normal pageblock, instead of just moving, we
isolate the freepage and free it in order to get merged.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Wen Congyang &lt;wency@cn.fujitsu.com&gt;
Cc: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Laura Abbott &lt;lauraa@codeaurora.org&gt;
Cc: Heesub Shin &lt;heesub.shin@samsung.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Ritesh Harjani &lt;ritesh.list@gmail.com&gt;
Cc: Gioh Kim &lt;gioh.kim@lge.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current pageblock isolation logic could isolate each pageblock
individually.  This causes freepage accounting problem if freepage with
pageblock order on isolate pageblock is merged with other freepage on
normal pageblock.  We can prevent merging by restricting max order of
merging to pageblock order if freepage is on isolate pageblock.

A side-effect of this change is that there could be non-merged buddy
freepage even if finishing pageblock isolation, because undoing
pageblock isolation is just to move freepage from isolate buddy list to
normal buddy list rather than to consider merging.  So, the patch also
makes undoing pageblock isolation consider freepage merge.  When
un-isolation, freepage with more than pageblock order and it's buddy are
checked.  If they are on normal pageblock, instead of just moving, we
isolate the freepage and free it in order to get merged.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Wen Congyang &lt;wency@cn.fujitsu.com&gt;
Cc: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Laura Abbott &lt;lauraa@codeaurora.org&gt;
Cc: Heesub Shin &lt;heesub.shin@samsung.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Ritesh Harjani &lt;ritesh.list@gmail.com&gt;
Cc: Gioh Kim &lt;gioh.kim@lge.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, compaction: pass gfp mask to compact_control</title>
<updated>2014-10-10T02:25:55+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2014-10-09T22:27:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6d7ce55940b6ecd463ca044ad241f0122d913293'/>
<id>6d7ce55940b6ecd463ca044ad241f0122d913293</id>
<content type='text'>
struct compact_control currently converts the gfp mask to a migratetype,
but we need the entire gfp mask in a follow-up patch.

Pass the entire gfp mask as part of struct compact_control.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
struct compact_control currently converts the gfp mask to a migratetype,
but we need the entire gfp mask in a follow-up patch.

Pass the entire gfp mask as part of struct compact_control.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, compaction: skip buddy pages by their order in the migrate scanner</title>
<updated>2014-10-10T02:25:54+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2014-10-09T22:27:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=99c0fd5e51c447917264154cb01a967804ace745'/>
<id>99c0fd5e51c447917264154cb01a967804ace745</id>
<content type='text'>
The migration scanner skips PageBuddy pages, but does not consider their
order as checking page_order() is generally unsafe without holding the
zone-&gt;lock, and acquiring the lock just for the check wouldn't be a good
tradeoff.

Still, this could avoid some iterations over the rest of the buddy page,
and if we are careful, the race window between PageBuddy() check and
page_order() is small, and the worst thing that can happen is that we skip
too much and miss some isolation candidates.  This is not that bad, as
compaction can already fail for many other reasons like parallel
allocations, and those have much larger race window.

This patch therefore makes the migration scanner obtain the buddy page
order and use it to skip the whole buddy page, if the order appears to be
in the valid range.

It's important that the page_order() is read only once, so that the value
used in the checks and in the pfn calculation is the same.  But in theory
the compiler can replace the local variable by multiple inlines of
page_order().  Therefore, the patch introduces page_order_unsafe() that
uses ACCESS_ONCE to prevent this.

Testing with stress-highalloc from mmtests shows a 15% reduction in number
of pages scanned by migration scanner.  The reduction is &gt;60% with
__GFP_NO_KSWAPD allocations, along with success rates better by few
percent.

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The migration scanner skips PageBuddy pages, but does not consider their
order as checking page_order() is generally unsafe without holding the
zone-&gt;lock, and acquiring the lock just for the check wouldn't be a good
tradeoff.

Still, this could avoid some iterations over the rest of the buddy page,
and if we are careful, the race window between PageBuddy() check and
page_order() is small, and the worst thing that can happen is that we skip
too much and miss some isolation candidates.  This is not that bad, as
compaction can already fail for many other reasons like parallel
allocations, and those have much larger race window.

This patch therefore makes the migration scanner obtain the buddy page
order and use it to skip the whole buddy page, if the order appears to be
in the valid range.

It's important that the page_order() is read only once, so that the value
used in the checks and in the pfn calculation is the same.  But in theory
the compiler can replace the local variable by multiple inlines of
page_order().  Therefore, the patch introduces page_order_unsafe() that
uses ACCESS_ONCE to prevent this.

Testing with stress-highalloc from mmtests shows a 15% reduction in number
of pages scanned by migration scanner.  The reduction is &gt;60% with
__GFP_NO_KSWAPD allocations, along with success rates better by few
percent.

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, compaction: khugepaged should not give up due to need_resched()</title>
<updated>2014-10-10T02:25:54+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2014-10-09T22:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=1f9efdef4f3f1d2a073e524113fd0038af636f2b'/>
<id>1f9efdef4f3f1d2a073e524113fd0038af636f2b</id>
<content type='text'>
Async compaction aborts when it detects zone lock contention or
need_resched() is true.  David Rientjes has reported that in practice,
most direct async compactions for THP allocation abort due to
need_resched().  This means that a second direct compaction is never
attempted, which might be OK for a page fault, but khugepaged is intended
to attempt a sync compaction in such case and in these cases it won't.

This patch replaces "bool contended" in compact_control with an int that
distinguishes between aborting due to need_resched() and aborting due to
lock contention.  This allows propagating the abort through all compaction
functions as before, but passing the abort reason up to
__alloc_pages_slowpath() which decides when to continue with direct
reclaim and another compaction attempt.

Another problem is that try_to_compact_pages() did not act upon the
reported contention (both need_resched() or lock contention) immediately
and would proceed with another zone from the zonelist.  When
need_resched() is true, that means initializing another zone compaction,
only to check again need_resched() in isolate_migratepages() and aborting.
 For zone lock contention, the unintended consequence is that the lock
contended status reported back to the allocator is detrmined from the last
zone where compaction was attempted, which is rather arbitrary.

This patch fixes the problem in the following way:
- async compaction of a zone aborting due to need_resched() or fatal signal
  pending means that further zones should not be tried. We report
  COMPACT_CONTENDED_SCHED to the allocator.
- aborting zone compaction due to lock contention means we can still try
  another zone, since it has different set of locks. We report back
  COMPACT_CONTENDED_LOCK only if *all* zones where compaction was attempted,
  it was aborted due to lock contention.

As a result of these fixes, khugepaged will proceed with second sync
compaction as intended, when the preceding async compaction aborted due to
need_resched().  Page fault compactions aborting due to need_resched()
will spare some cycles previously wasted by initializing another zone
compaction only to abort again.  Lock contention will be reported only
when compaction in all zones aborted due to lock contention, and therefore
it's not a good idea to try again after reclaim.

In stress-highalloc from mmtests configured to use __GFP_NO_KSWAPD, this
has improved number of THP collapse allocations by 10%, which shows
positive effect on khugepaged.  The benchmark's success rates are
unchanged as it is not recognized as khugepaged.  Numbers of compact_stall
and compact_fail events have however decreased by 20%, with
compact_success still a bit improved, which is good.  With benchmark
configured not to use __GFP_NO_KSWAPD, there is 6% improvement in THP
collapse allocations, and only slight improvement in stalls and failures.

[akpm@linux-foundation.org: fix warnings]
Reported-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Async compaction aborts when it detects zone lock contention or
need_resched() is true.  David Rientjes has reported that in practice,
most direct async compactions for THP allocation abort due to
need_resched().  This means that a second direct compaction is never
attempted, which might be OK for a page fault, but khugepaged is intended
to attempt a sync compaction in such case and in these cases it won't.

This patch replaces "bool contended" in compact_control with an int that
distinguishes between aborting due to need_resched() and aborting due to
lock contention.  This allows propagating the abort through all compaction
functions as before, but passing the abort reason up to
__alloc_pages_slowpath() which decides when to continue with direct
reclaim and another compaction attempt.

Another problem is that try_to_compact_pages() did not act upon the
reported contention (both need_resched() or lock contention) immediately
and would proceed with another zone from the zonelist.  When
need_resched() is true, that means initializing another zone compaction,
only to check again need_resched() in isolate_migratepages() and aborting.
 For zone lock contention, the unintended consequence is that the lock
contended status reported back to the allocator is detrmined from the last
zone where compaction was attempted, which is rather arbitrary.

This patch fixes the problem in the following way:
- async compaction of a zone aborting due to need_resched() or fatal signal
  pending means that further zones should not be tried. We report
  COMPACT_CONTENDED_SCHED to the allocator.
- aborting zone compaction due to lock contention means we can still try
  another zone, since it has different set of locks. We report back
  COMPACT_CONTENDED_LOCK only if *all* zones where compaction was attempted,
  it was aborted due to lock contention.

As a result of these fixes, khugepaged will proceed with second sync
compaction as intended, when the preceding async compaction aborted due to
need_resched().  Page fault compactions aborting due to need_resched()
will spare some cycles previously wasted by initializing another zone
compaction only to abort again.  Lock contention will be reported only
when compaction in all zones aborted due to lock contention, and therefore
it's not a good idea to try again after reclaim.

In stress-highalloc from mmtests configured to use __GFP_NO_KSWAPD, this
has improved number of THP collapse allocations by 10%, which shows
positive effect on khugepaged.  The benchmark's success rates are
unchanged as it is not recognized as khugepaged.  Numbers of compact_stall
and compact_fail events have however decreased by 20%, with
compact_success still a bit improved, which is good.  With benchmark
configured not to use __GFP_NO_KSWAPD, there is 6% improvement in THP
collapse allocations, and only slight improvement in stalls and failures.

[akpm@linux-foundation.org: fix warnings]
Reported-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, compaction: move pageblock checks up from isolate_migratepages_range()</title>
<updated>2014-10-10T02:25:54+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2014-10-09T22:27:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=edc2ca61249679298c1f343cd9c549964b8df4b4'/>
<id>edc2ca61249679298c1f343cd9c549964b8df4b4</id>
<content type='text'>
isolate_migratepages_range() is the main function of the compaction
scanner, called either on a single pageblock by isolate_migratepages()
during regular compaction, or on an arbitrary range by CMA's
__alloc_contig_migrate_range().  It currently perfoms two pageblock-wide
compaction suitability checks, and because of the CMA callpath, it tracks
if it crossed a pageblock boundary in order to repeat those checks.

However, closer inspection shows that those checks are always true for CMA:
- isolation_suitable() is true because CMA sets cc-&gt;ignore_skip_hint to true
- migrate_async_suitable() check is skipped because CMA uses sync compaction

We can therefore move the compaction-specific checks to
isolate_migratepages() and simplify isolate_migratepages_range().
Furthermore, we can mimic the freepage scanner family of functions, which
has isolate_freepages_block() function called both by compaction from
isolate_freepages() and by CMA from isolate_freepages_range(), where each
use-case adds own specific glue code.  This allows further code
simplification.

Thus, we rename isolate_migratepages_range() to
isolate_migratepages_block() and limit its functionality to a single
pageblock (or its subset).  For CMA, a new different
isolate_migratepages_range() is created as a CMA-specific wrapper for the
_block() function.  The checks specific to compaction are moved to
isolate_migratepages().  As part of the unification of these two families
of functions, we remove the redundant zone parameter where applicable,
since zone pointer is already passed in cc-&gt;zone.

Furthermore, going back to compact_zone() and compact_finished() when
pageblock is found unsuitable (now by isolate_migratepages()) is wasteful
- the checks are meant to skip pageblocks quickly.  The patch therefore
also introduces a simple loop into isolate_migratepages() so that it does
not return immediately on failed pageblock checks, but keeps going until
isolate_migratepages_range() gets called once.  Similarily to
isolate_freepages(), the function periodically checks if it needs to
reschedule or abort async compaction.

[iamjoonsoo.kim@lge.com: fix isolated page counting bug in compaction]
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
isolate_migratepages_range() is the main function of the compaction
scanner, called either on a single pageblock by isolate_migratepages()
during regular compaction, or on an arbitrary range by CMA's
__alloc_contig_migrate_range().  It currently perfoms two pageblock-wide
compaction suitability checks, and because of the CMA callpath, it tracks
if it crossed a pageblock boundary in order to repeat those checks.

However, closer inspection shows that those checks are always true for CMA:
- isolation_suitable() is true because CMA sets cc-&gt;ignore_skip_hint to true
- migrate_async_suitable() check is skipped because CMA uses sync compaction

We can therefore move the compaction-specific checks to
isolate_migratepages() and simplify isolate_migratepages_range().
Furthermore, we can mimic the freepage scanner family of functions, which
has isolate_freepages_block() function called both by compaction from
isolate_freepages() and by CMA from isolate_freepages_range(), where each
use-case adds own specific glue code.  This allows further code
simplification.

Thus, we rename isolate_migratepages_range() to
isolate_migratepages_block() and limit its functionality to a single
pageblock (or its subset).  For CMA, a new different
isolate_migratepages_range() is created as a CMA-specific wrapper for the
_block() function.  The checks specific to compaction are moved to
isolate_migratepages().  As part of the unification of these two families
of functions, we remove the redundant zone parameter where applicable,
since zone pointer is already passed in cc-&gt;zone.

Furthermore, going back to compact_zone() and compact_finished() when
pageblock is found unsuitable (now by isolate_migratepages()) is wasteful
- the checks are meant to skip pageblocks quickly.  The patch therefore
also introduces a simple loop into isolate_migratepages() so that it does
not return immediately on failed pageblock checks, but keeps going until
isolate_migratepages_range() gets called once.  Similarily to
isolate_freepages(), the function periodically checks if it needs to
reschedule or abort async compaction.

[iamjoonsoo.kim@lge.com: fix isolated page counting bug in compaction]
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/internal.h: use nth_page</title>
<updated>2014-08-07T01:01:16+00:00</updated>
<author>
<name>Fabian Frederick</name>
<email>fabf@skynet.be</email>
</author>
<published>2014-08-06T23:05:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=bc7f84c0e67c0ca90b6d0e95cc293ed5d8ad30c4'/>
<id>bc7f84c0e67c0ca90b6d0e95cc293ed5d8ad30c4</id>
<content type='text'>
Use nth_page instead of pfn_to_page(page_to_pfn

Signed-off-by: Fabian Frederick &lt;fabf@skynet.be&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use nth_page instead of pfn_to_page(page_to_pfn

Signed-off-by: Fabian Frederick &lt;fabf@skynet.be&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, compaction: properly signal and act upon lock and need_sched() contention</title>
<updated>2014-06-04T23:54:11+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2014-06-04T23:10:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=be9765722e6b7ece8263cbab857490332339bd6f'/>
<id>be9765722e6b7ece8263cbab857490332339bd6f</id>
<content type='text'>
Compaction uses compact_checklock_irqsave() function to periodically check
for lock contention and need_resched() to either abort async compaction,
or to free the lock, schedule and retake the lock.  When aborting,
cc-&gt;contended is set to signal the contended state to the caller.  Two
problems have been identified in this mechanism.

First, compaction also calls directly cond_resched() in both scanners when
no lock is yet taken.  This call either does not abort async compaction,
or set cc-&gt;contended appropriately.  This patch introduces a new
compact_should_abort() function to achieve both.  In isolate_freepages(),
the check frequency is reduced to once by SWAP_CLUSTER_MAX pageblocks to
match what the migration scanner does in the preliminary page checks.  In
case a pageblock is found suitable for calling isolate_freepages_block(),
the checks within there are done on higher frequency.

Second, isolate_freepages() does not check if isolate_freepages_block()
aborted due to contention, and advances to the next pageblock.  This
violates the principle of aborting on contention, and might result in
pageblocks not being scanned completely, since the scanning cursor is
advanced.  This problem has been noticed in the code by Joonsoo Kim when
reviewing related patches.  This patch makes isolate_freepages_block()
check the cc-&gt;contended flag and abort.

In case isolate_freepages() has already isolated some pages before
aborting due to contention, page migration will proceed, which is OK since
we do not want to waste the work that has been done, and page migration
has own checks for contention.  However, we do not want another isolation
attempt by either of the scanners, so cc-&gt;contended flag check is added
also to compaction_alloc() and compact_finished() to make sure compaction
is aborted right after the migration.

The outcome of the patch should be reduced lock contention by async
compaction and lower latencies for higher-order allocations where direct
compaction is involved.

[akpm@linux-foundation.org: fix typo in comment]
Reported-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Tested-by: Shawn Guo &lt;shawn.guo@linaro.org&gt;
Tested-by: Kevin Hilman &lt;khilman@linaro.org&gt;
Tested-by: Stephen Warren &lt;swarren@nvidia.com&gt;
Tested-by: Fabio Estevam &lt;fabio.estevam@freescale.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Compaction uses compact_checklock_irqsave() function to periodically check
for lock contention and need_resched() to either abort async compaction,
or to free the lock, schedule and retake the lock.  When aborting,
cc-&gt;contended is set to signal the contended state to the caller.  Two
problems have been identified in this mechanism.

First, compaction also calls directly cond_resched() in both scanners when
no lock is yet taken.  This call either does not abort async compaction,
or set cc-&gt;contended appropriately.  This patch introduces a new
compact_should_abort() function to achieve both.  In isolate_freepages(),
the check frequency is reduced to once by SWAP_CLUSTER_MAX pageblocks to
match what the migration scanner does in the preliminary page checks.  In
case a pageblock is found suitable for calling isolate_freepages_block(),
the checks within there are done on higher frequency.

Second, isolate_freepages() does not check if isolate_freepages_block()
aborted due to contention, and advances to the next pageblock.  This
violates the principle of aborting on contention, and might result in
pageblocks not being scanned completely, since the scanning cursor is
advanced.  This problem has been noticed in the code by Joonsoo Kim when
reviewing related patches.  This patch makes isolate_freepages_block()
check the cc-&gt;contended flag and abort.

In case isolate_freepages() has already isolated some pages before
aborting due to contention, page migration will proceed, which is OK since
we do not want to waste the work that has been done, and page migration
has own checks for contention.  However, we do not want another isolation
attempt by either of the scanners, so cc-&gt;contended flag check is added
also to compaction_alloc() and compact_finished() to make sure compaction
is aborted right after the migration.

The outcome of the patch should be reduced lock contention by async
compaction and lower latencies for higher-order allocations where direct
compaction is involved.

[akpm@linux-foundation.org: fix typo in comment]
Reported-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Tested-by: Shawn Guo &lt;shawn.guo@linaro.org&gt;
Tested-by: Kevin Hilman &lt;khilman@linaro.org&gt;
Tested-by: Stephen Warren &lt;swarren@nvidia.com&gt;
Tested-by: Fabio Estevam &lt;fabio.estevam@freescale.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fold mlocked_vma_newpage() into its only call site</title>
<updated>2014-06-04T23:54:07+00:00</updated>
<author>
<name>Jianyu Zhan</name>
<email>nasa4836@gmail.com</email>
</author>
<published>2014-06-04T23:09:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7ee07a44eb53374a73544ae14c71366a02d462e0'/>
<id>7ee07a44eb53374a73544ae14c71366a02d462e0</id>
<content type='text'>
In previous commit(mm: use the light version __mod_zone_page_state in
mlocked_vma_newpage()) a irq-unsafe __mod_zone_page_state is used.  And as
suggested by Andrew, to reduce the risks that new call sites incorrectly
using mlocked_vma_newpage() without knowing they are adding racing, this
patch folds mlocked_vma_newpage() into its only call site,
page_add_new_anon_rmap, to make it open-cocded for people to know what is
going on.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jianyu Zhan &lt;nasa4836@gmail.com&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Suggested-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In previous commit(mm: use the light version __mod_zone_page_state in
mlocked_vma_newpage()) a irq-unsafe __mod_zone_page_state is used.  And as
suggested by Andrew, to reduce the risks that new call sites incorrectly
using mlocked_vma_newpage() without knowing they are adding racing, this
patch folds mlocked_vma_newpage() into its only call site,
page_add_new_anon_rmap, to make it open-cocded for people to know what is
going on.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jianyu Zhan &lt;nasa4836@gmail.com&gt;
Suggested-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Suggested-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
