<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/mm, branch v3.15-rc6</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>Merge tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag</title>
<updated>2014-05-20T05:30:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-05-20T05:30:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=41abc90228f98774263572ec99e7ab820f091002'/>
<id>41abc90228f98774263572ec99e7ab820f091002</id>
<content type='text'>
Pull Metag architecture and related fixes from James Hogan:
 "Mostly fixes for metag and parisc relating to upgrowing stacks.

   - Fix missing compiler barriers in metag memory barriers.
   - Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased
     beyond safe value.
   - Make maximum stack size configurable.  This reduces the default
     user stack size back to 80MB (especially on parisc after their
     removal of _STK_LIM_MAX override).  This only affects metag and
     parisc.
   - Remove metag _STK_LIM_MAX override to match other arches and follow
     parisc, now that it is safe to do so (due to the BUG_ON fix
     mentioned above).
   - Finally now that both metag and parisc _STK_LIM_MAX overrides have
     been removed, it makes sense to remove _STK_LIM_MAX altogether"

* tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  asm-generic: remove _STK_LIM_MAX
  metag: Remove _STK_LIM_MAX override
  parisc,metag: Do not hardcode maximum userspace stack size
  metag: Reduce maximum stack size to 256MB
  metag: fix memory barriers
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull Metag architecture and related fixes from James Hogan:
 "Mostly fixes for metag and parisc relating to upgrowing stacks.

   - Fix missing compiler barriers in metag memory barriers.
   - Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased
     beyond safe value.
   - Make maximum stack size configurable.  This reduces the default
     user stack size back to 80MB (especially on parisc after their
     removal of _STK_LIM_MAX override).  This only affects metag and
     parisc.
   - Remove metag _STK_LIM_MAX override to match other arches and follow
     parisc, now that it is safe to do so (due to the BUG_ON fix
     mentioned above).
   - Finally now that both metag and parisc _STK_LIM_MAX overrides have
     been removed, it makes sense to remove _STK_LIM_MAX altogether"

* tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  asm-generic: remove _STK_LIM_MAX
  metag: Remove _STK_LIM_MAX override
  parisc,metag: Do not hardcode maximum userspace stack size
  metag: Reduce maximum stack size to 256MB
  metag: fix memory barriers
</pre>
</div>
</content>
</entry>
<entry>
<title>parisc,metag: Do not hardcode maximum userspace stack size</title>
<updated>2014-05-14T23:01:41+00:00</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2014-04-30T21:26:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=042d27acb64924a0e8a43e972485913a32407beb'/>
<id>042d27acb64924a0e8a43e972485913a32407beb</id>
<content type='text'>
This patch affects only architectures where the stack grows upwards
(currently parisc and metag only). On those do not hardcode the maximum
initial stack size to 1GB for 32-bit processes, but make it configurable
via a config option.

The main problem with the hardcoded stack size is, that we have two
memory regions which grow upwards: stack and heap. To keep most of the
memory available for heap in a flexmap memory layout, it makes no sense
to hard allocate up to 1GB of the memory for stack which can't be used
as heap then.

This patch makes the stack size for 32-bit processes configurable and
uses 80MB as default value which has been in use during the last few
years on parisc and which hasn't showed any problems yet.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: linux-parisc@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Cc: John David Anglin &lt;dave.anglin@bell.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch affects only architectures where the stack grows upwards
(currently parisc and metag only). On those do not hardcode the maximum
initial stack size to 1GB for 32-bit processes, but make it configurable
via a config option.

The main problem with the hardcoded stack size is, that we have two
memory regions which grow upwards: stack and heap. To keep most of the
memory available for heap in a flexmap memory layout, it makes no sense
to hard allocate up to 1GB of the memory for stack which can't be used
as heap then.

This patch makes the stack size for 32-bit processes configurable and
uses 80MB as default value which has been in use during the last few
years on parisc and which hasn't showed any problems yet.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: linux-parisc@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Cc: John David Anglin &lt;dave.anglin@bell.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu</title>
<updated>2014-05-13T02:25:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-05-13T02:25:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=68cb363a4ddc335fddf6e2ccb03e4a907ba7afbf'/>
<id>68cb363a4ddc335fddf6e2ccb03e4a907ba7afbf</id>
<content type='text'>
Pull a percpu fix from Tejun Heo:
 "Fix for a percpu allocator bug where it could try to kfree() a memory
  region allocated using vmalloc().  The bug has been there for years
  now and is unlikely to have ever triggered given the size of struct
  pcpu_chunk.  It's still theoretically possible and the fix is simple
  and safe enough, so the patch is marked with -stable"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull a percpu fix from Tejun Heo:
 "Fix for a percpu allocator bug where it could try to kfree() a memory
  region allocated using vmalloc().  The bug has been there for years
  now and is unlikely to have ever triggered given the size of struct
  pcpu_chunk.  It's still theoretically possible and the fix is simple
  and safe enough, so the patch is marked with -stable"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree()
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, thp: close race between mremap() and split_huge_page()</title>
<updated>2014-05-11T08:55:48+00:00</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2014-05-09T22:37:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=dd18dbc2d42af75fffa60c77e0f02220bc329829'/>
<id>dd18dbc2d42af75fffa60c77e0f02220bc329829</id>
<content type='text'>
It's critical for split_huge_page() (and migration) to catch and freeze
all PMDs on rmap walk.  It gets tricky if there's concurrent fork() or
mremap() since usually we copy/move page table entries on dup_mm() or
move_page_tables() without rmap lock taken.  To get it work we rely on
rmap walk order to not miss any entry.  We expect to see destination VMA
after source one to work correctly.

But after switching rmap implementation to interval tree it's not always
possible to preserve expected walk order.

It works fine for dup_mm() since new VMA has the same vma_start_pgoff()
/ vma_last_pgoff() and explicitly insert dst VMA after src one with
vma_interval_tree_insert_after().

But on move_vma() destination VMA can be merged into adjacent one and as
result shifted left in interval tree.  Fortunately, we can detect the
situation and prevent race with rmap walk by moving page table entries
under rmap lock.  See commit 38a76013ad80.

Problem is that we miss the lock when we move transhuge PMD.  Most
likely this bug caused the crash[1].

[1] http://thread.gmane.org/gmane.linux.kernel.mm/96473

Fixes: 108d6642ad81 ("mm anon rmap: remove anon_vma_moveto_tail")

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;        [3.7+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's critical for split_huge_page() (and migration) to catch and freeze
all PMDs on rmap walk.  It gets tricky if there's concurrent fork() or
mremap() since usually we copy/move page table entries on dup_mm() or
move_page_tables() without rmap lock taken.  To get it work we rely on
rmap walk order to not miss any entry.  We expect to see destination VMA
after source one to work correctly.

But after switching rmap implementation to interval tree it's not always
possible to preserve expected walk order.

It works fine for dup_mm() since new VMA has the same vma_start_pgoff()
/ vma_last_pgoff() and explicitly insert dst VMA after src one with
vma_interval_tree_insert_after().

But on move_vma() destination VMA can be merged into adjacent one and as
result shifted left in interval tree.  Fortunately, we can detect the
situation and prevent race with rmap walk by moving page table entries
under rmap lock.  See commit 38a76013ad80.

Problem is that we miss the lock when we move transhuge PMD.  Most
likely this bug caused the crash[1].

[1] http://thread.gmane.org/gmane.linux.kernel.mm/96473

Fixes: 108d6642ad81 ("mm anon rmap: remove anon_vma_moveto_tail")

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;        [3.7+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: postpone the disabling of kmemleak early logging</title>
<updated>2014-05-11T08:55:48+00:00</updated>
<author>
<name>Catalin Marinas</name>
<email>catalin.marinas@arm.com</email>
</author>
<published>2014-05-09T22:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=3551a9280bcb728980a13783ff295e9f0bdedd9a'/>
<id>3551a9280bcb728980a13783ff295e9f0bdedd9a</id>
<content type='text'>
Commit 8910ae896c8c ("kmemleak: change some global variables to int"),
in addition to the atomic -&gt; int conversion, moved the disabling of
kmemleak_early_log to the beginning of the kmemleak_init() function,
before the full kmemleak tracing is actually enabled.  In this small
window, kmem_cache_create() is called by kmemleak which triggers
additional memory allocation that are not traced.  This patch restores
the original logic with kmemleak_early_log disabling when kmemleak is
fully functional.

Fixes: 8910ae896c8c (kmemleak: change some global variables to int)

Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 8910ae896c8c ("kmemleak: change some global variables to int"),
in addition to the atomic -&gt; int conversion, moved the disabling of
kmemleak_early_log to the beginning of the kmemleak_init() function,
before the full kmemleak tracing is actually enabled.  In this small
window, kmem_cache_create() is called by kmemleak which triggers
additional memory allocation that are not traced.  This patch restores
the original logic with kmemleak_early_log disabling when kmemleak is
fully functional.

Fixes: 8910ae896c8c (kmemleak: change some global variables to int)

Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (incoming from Andrew)</title>
<updated>2014-05-06T20:07:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-05-06T20:07:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=38583f095c5a8138ae2a1c9173d0fd8a9f10e8aa'/>
<id>38583f095c5a8138ae2a1c9173d0fd8a9f10e8aa</id>
<content type='text'>
Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;:
  agp: info leak in agpioc_info_wrap()
  fs/affs/super.c: bugfix / double free
  fanotify: fix -EOVERFLOW with large files on 64-bit
  slub: use sysfs'es release mechanism for kmem_cache
  revert "mm: vmscan: do not swap anon pages just because free+file is low"
  autofs: fix lockref lookup
  mm: filemap: update find_get_pages_tag() to deal with shadow entries
  mm/compaction: make isolate_freepages start at pageblock boundary
  MAINTAINERS: zswap/zbud: change maintainer email address
  mm/page-writeback.c: fix divide by zero in pos_ratio_polynom
  hugetlb: ensure hugepage access is denied if hugepages are not supported
  slub: fix memcg_propagate_slab_attrs
  drivers/rtc/rtc-pcf8523.c: fix month definition
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;:
  agp: info leak in agpioc_info_wrap()
  fs/affs/super.c: bugfix / double free
  fanotify: fix -EOVERFLOW with large files on 64-bit
  slub: use sysfs'es release mechanism for kmem_cache
  revert "mm: vmscan: do not swap anon pages just because free+file is low"
  autofs: fix lockref lookup
  mm: filemap: update find_get_pages_tag() to deal with shadow entries
  mm/compaction: make isolate_freepages start at pageblock boundary
  MAINTAINERS: zswap/zbud: change maintainer email address
  mm/page-writeback.c: fix divide by zero in pos_ratio_polynom
  hugetlb: ensure hugepage access is denied if hugepages are not supported
  slub: fix memcg_propagate_slab_attrs
  drivers/rtc/rtc-pcf8523.c: fix month definition
</pre>
</div>
</content>
</entry>
<entry>
<title>slub: use sysfs'es release mechanism for kmem_cache</title>
<updated>2014-05-06T20:04:59+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux.com</email>
</author>
<published>2014-05-06T19:50:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=41a212859a4dd583d3aa032cdd3efa564c4f189f'/>
<id>41a212859a4dd583d3aa032cdd3efa564c4f189f</id>
<content type='text'>
debugobjects warning during netfilter exit:

    ------------[ cut here ]------------
    WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G        W 3.11.0-next-20130906-sasha #3984
    Workqueue: netns cleanup_net
    Call Trace:
      dump_stack+0x52/0x87
      warn_slowpath_common+0x8c/0xc0
      warn_slowpath_fmt+0x46/0x50
      debug_print_object+0x8d/0xb0
      __debug_check_no_obj_freed+0xa5/0x220
      debug_check_no_obj_freed+0x15/0x20
      kmem_cache_free+0x197/0x340
      kmem_cache_destroy+0x86/0xe0
      nf_conntrack_cleanup_net_list+0x131/0x170
      nf_conntrack_pernet_exit+0x5d/0x70
      ops_exit_list+0x5e/0x70
      cleanup_net+0xfb/0x1c0
      process_one_work+0x338/0x550
      worker_thread+0x215/0x350
      kthread+0xe7/0xf0
      ret_from_fork+0x7c/0xb0

Also during dcookie cleanup:

    WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
    Call Trace:
      dump_stack (lib/dump_stack.c:52)
      warn_slowpath_common (kernel/panic.c:430)
      warn_slowpath_fmt (kernel/panic.c:445)
      debug_print_object (lib/debugobjects.c:262)
      __debug_check_no_obj_freed (lib/debugobjects.c:697)
      debug_check_no_obj_freed (lib/debugobjects.c:726)
      kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
      kmem_cache_destroy (mm/slab_common.c:363)
      dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
      event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
      __fput (fs/file_table.c:217)
      ____fput (fs/file_table.c:253)
      task_work_run (kernel/task_work.c:125 (discriminator 1))
      do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
      int_signal (arch/x86/kernel/entry_64.S:807)

Sysfs has a release mechanism.  Use that to release the kmem_cache
structure if CONFIG_SYSFS is enabled.

Only slub is changed - slab currently only supports /proc/slabinfo and
not /sys/kernel/slab/*.  We talked about adding that and someone was
working on it.

[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Acked-by: Greg KH &lt;greg@kroah.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
debugobjects warning during netfilter exit:

    ------------[ cut here ]------------
    WARNING: CPU: 6 PID: 4178 at lib/debugobjects.c:260 debug_print_object+0x8d/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 6 PID: 4178 Comm: kworker/u16:2 Tainted: G        W 3.11.0-next-20130906-sasha #3984
    Workqueue: netns cleanup_net
    Call Trace:
      dump_stack+0x52/0x87
      warn_slowpath_common+0x8c/0xc0
      warn_slowpath_fmt+0x46/0x50
      debug_print_object+0x8d/0xb0
      __debug_check_no_obj_freed+0xa5/0x220
      debug_check_no_obj_freed+0x15/0x20
      kmem_cache_free+0x197/0x340
      kmem_cache_destroy+0x86/0xe0
      nf_conntrack_cleanup_net_list+0x131/0x170
      nf_conntrack_pernet_exit+0x5d/0x70
      ops_exit_list+0x5e/0x70
      cleanup_net+0xfb/0x1c0
      process_one_work+0x338/0x550
      worker_thread+0x215/0x350
      kthread+0xe7/0xf0
      ret_from_fork+0x7c/0xb0

Also during dcookie cleanup:

    WARNING: CPU: 12 PID: 9725 at lib/debugobjects.c:260 debug_print_object+0x8c/0xb0()
    ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
    Modules linked in:
    CPU: 12 PID: 9725 Comm: trinity-c141 Not tainted 3.15.0-rc2-next-20140423-sasha-00018-gc4ff6c4 #408
    Call Trace:
      dump_stack (lib/dump_stack.c:52)
      warn_slowpath_common (kernel/panic.c:430)
      warn_slowpath_fmt (kernel/panic.c:445)
      debug_print_object (lib/debugobjects.c:262)
      __debug_check_no_obj_freed (lib/debugobjects.c:697)
      debug_check_no_obj_freed (lib/debugobjects.c:726)
      kmem_cache_free (mm/slub.c:2689 mm/slub.c:2717)
      kmem_cache_destroy (mm/slab_common.c:363)
      dcookie_unregister (fs/dcookies.c:302 fs/dcookies.c:343)
      event_buffer_release (arch/x86/oprofile/../../../drivers/oprofile/event_buffer.c:153)
      __fput (fs/file_table.c:217)
      ____fput (fs/file_table.c:253)
      task_work_run (kernel/task_work.c:125 (discriminator 1))
      do_notify_resume (include/linux/tracehook.h:196 arch/x86/kernel/signal.c:751)
      int_signal (arch/x86/kernel/entry_64.S:807)

Sysfs has a release mechanism.  Use that to release the kmem_cache
structure if CONFIG_SYSFS is enabled.

Only slub is changed - slab currently only supports /proc/slabinfo and
not /sys/kernel/slab/*.  We talked about adding that and someone was
working on it.

[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more]
Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Tested-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Acked-by: Greg KH &lt;greg@kroah.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>revert "mm: vmscan: do not swap anon pages just because free+file is low"</title>
<updated>2014-05-06T20:04:59+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2014-05-06T19:50:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=623762517e2370be3b3f95f4fe08d6c063a49b06'/>
<id>623762517e2370be3b3f95f4fe08d6c063a49b06</id>
<content type='text'>
This reverts commit 0bf1457f0cfc ("mm: vmscan: do not swap anon pages
just because free+file is low") because it introduced a regression in
mostly-anonymous workloads, where reclaim would become ineffective and
trap every allocating task in direct reclaim.

The problem is that there is a runaway feedback loop in the scan balance
between file and anon, where the balance tips heavily towards a tiny
thrashing file LRU and anonymous pages are no longer being looked at.
The commit in question removed the safe guard that would detect such
situations and respond with forced anonymous reclaim.

This commit was part of a series to fix premature swapping in loads with
relatively little cache, and while it made a small difference, the cure
is obviously worse than the disease.  Revert it.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Acked-by: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: &lt;stable@kernel.org&gt;		[3.12+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 0bf1457f0cfc ("mm: vmscan: do not swap anon pages
just because free+file is low") because it introduced a regression in
mostly-anonymous workloads, where reclaim would become ineffective and
trap every allocating task in direct reclaim.

The problem is that there is a runaway feedback loop in the scan balance
between file and anon, where the balance tips heavily towards a tiny
thrashing file LRU and anonymous pages are no longer being looked at.
The commit in question removed the safe guard that would detect such
situations and respond with forced anonymous reclaim.

This commit was part of a series to fix premature swapping in loads with
relatively little cache, and while it made a small difference, the cure
is obviously worse than the disease.  Revert it.

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Acked-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Acked-by: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: &lt;stable@kernel.org&gt;		[3.12+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: filemap: update find_get_pages_tag() to deal with shadow entries</title>
<updated>2014-05-06T20:04:59+00:00</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2014-05-06T19:50:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=139b6a6fb1539e04b01663d61baff3088c63dbb5'/>
<id>139b6a6fb1539e04b01663d61baff3088c63dbb5</id>
<content type='text'>
Dave Jones reports the following crash when find_get_pages_tag() runs
into an exceptional entry:

  kernel BUG at mm/filemap.c:1347!
  RIP: find_get_pages_tag+0x1cb/0x220
  Call Trace:
    find_get_pages_tag+0x36/0x220
    pagevec_lookup_tag+0x21/0x30
    filemap_fdatawait_range+0xbe/0x1e0
    filemap_fdatawait+0x27/0x30
    sync_inodes_sb+0x204/0x2a0
    sync_inodes_one_sb+0x19/0x20
    iterate_supers+0xb2/0x110
    sys_sync+0x44/0xb0
    ia32_do_call+0x13/0x13

  1343                         /*
  1344                          * This function is never used on a shmem/tmpfs
  1345                          * mapping, so a swap entry won't be found here.
  1346                          */
  1347                         BUG();

After commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in
page cache radix trees") this comment and BUG() are out of date because
exceptional entries can now appear in all mappings - as shadows of
recently evicted pages.

However, as Hugh Dickins notes,

  "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
   any other PAGECACHE_TAG_*) to appear on an exceptional entry.

   I expect it comes down to an occasional race in RCU lookup of the
   radix_tree: lacking absolute synchronization, we might sometimes
   catch an exceptional entry, with the tag which really belongs with
   the unexceptional entry which was there an instant before."

And indeed, not only is the tree walk lockless, the tags are also read
in chunks, one radix tree node at a time.  There is plenty of time for
page reclaim to swoop in and replace a page that was already looked up
as tagged with a shadow entry.

Remove the BUG() and update the comment.  While reviewing all other
lookup sites for whether they properly deal with shadow entries of
evicted pages, update all the comments and fix memcg file charge moving
to not miss shmem/tmpfs swapcache pages.

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dave Jones reports the following crash when find_get_pages_tag() runs
into an exceptional entry:

  kernel BUG at mm/filemap.c:1347!
  RIP: find_get_pages_tag+0x1cb/0x220
  Call Trace:
    find_get_pages_tag+0x36/0x220
    pagevec_lookup_tag+0x21/0x30
    filemap_fdatawait_range+0xbe/0x1e0
    filemap_fdatawait+0x27/0x30
    sync_inodes_sb+0x204/0x2a0
    sync_inodes_one_sb+0x19/0x20
    iterate_supers+0xb2/0x110
    sys_sync+0x44/0xb0
    ia32_do_call+0x13/0x13

  1343                         /*
  1344                          * This function is never used on a shmem/tmpfs
  1345                          * mapping, so a swap entry won't be found here.
  1346                          */
  1347                         BUG();

After commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in
page cache radix trees") this comment and BUG() are out of date because
exceptional entries can now appear in all mappings - as shadows of
recently evicted pages.

However, as Hugh Dickins notes,

  "it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
   any other PAGECACHE_TAG_*) to appear on an exceptional entry.

   I expect it comes down to an occasional race in RCU lookup of the
   radix_tree: lacking absolute synchronization, we might sometimes
   catch an exceptional entry, with the tag which really belongs with
   the unexceptional entry which was there an instant before."

And indeed, not only is the tree walk lockless, the tags are also read
in chunks, one radix tree node at a time.  There is plenty of time for
page reclaim to swoop in and replace a page that was already looked up
as tagged with a shadow entry.

Remove the BUG() and update the comment.  While reviewing all other
lookup sites for whether they properly deal with shadow entries of
evicted pages, update all the comments and fix memcg file charge moving
to not miss shmem/tmpfs swapcache pages.

Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/compaction: make isolate_freepages start at pageblock boundary</title>
<updated>2014-05-06T20:04:59+00:00</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2014-05-06T19:50:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=49e068f0b73dd042c186ffa9b420a9943e90389a'/>
<id>49e068f0b73dd042c186ffa9b420a9943e90389a</id>
<content type='text'>
The compaction freepage scanner implementation in isolate_freepages()
starts by taking the current cc-&gt;free_pfn value as the first pfn.  In a
for loop, it scans from this first pfn to the end of the pageblock, and
then subtracts pageblock_nr_pages from the first pfn to obtain the first
pfn for the next for loop iteration.

This means that when cc-&gt;free_pfn starts at offset X rather than being
aligned on pageblock boundary, the scanner will start at offset X in all
scanned pageblock, ignoring potentially many free pages.  Currently this
can happen when

 a) zone's end pfn is not pageblock aligned, or

 b) through zone-&gt;compact_cached_free_pfn with CONFIG_HOLES_IN_ZONE
    enabled and a hole spanning the beginning of a pageblock

This patch fixes the problem by aligning the initial pfn in
isolate_freepages() to pageblock boundary.  This also permits replacing
the end-of-pageblock alignment within the for loop with a simple
pageblock_nr_pages increment.

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reported-by: Heesub Shin &lt;heesub.shin@samsung.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Dongjun Shin &lt;d.j.shin@samsung.com&gt;
Cc: Sunghwan Yun &lt;sunghwan.yun@samsung.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The compaction freepage scanner implementation in isolate_freepages()
starts by taking the current cc-&gt;free_pfn value as the first pfn.  In a
for loop, it scans from this first pfn to the end of the pageblock, and
then subtracts pageblock_nr_pages from the first pfn to obtain the first
pfn for the next for loop iteration.

This means that when cc-&gt;free_pfn starts at offset X rather than being
aligned on pageblock boundary, the scanner will start at offset X in all
scanned pageblock, ignoring potentially many free pages.  Currently this
can happen when

 a) zone's end pfn is not pageblock aligned, or

 b) through zone-&gt;compact_cached_free_pfn with CONFIG_HOLES_IN_ZONE
    enabled and a hole spanning the beginning of a pageblock

This patch fixes the problem by aligning the initial pfn in
isolate_freepages() to pageblock boundary.  This also permits replacing
the end-of-pageblock alignment within the for loop with a simple
pageblock_nr_pages increment.

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reported-by: Heesub Shin &lt;heesub.shin@samsung.com&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Dongjun Shin &lt;d.j.shin@samsung.com&gt;
Cc: Sunghwan Yun &lt;sunghwan.yun@samsung.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
