<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/ipc/shm.c, branch v2.6.26-rc7</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>/proc/sysvipc/shm: fix 32-bit truncation of segment sizes</title>
<updated>2008-06-13T01:05:41+00:00</updated>
<author>
<name>Paul Menage</name>
<email>menage@google.com</email>
</author>
<published>2008-06-12T22:21:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=6c826818ff55eae7702b778b5f8bdf765af3b2af'/>
<id>6c826818ff55eae7702b778b5f8bdf765af3b2af</id>
<content type='text'>
sysvipc_shm_proc_show() picks between format strings (based on the
expected maximum length of a SHM segment) in a way that prevents gcc from
performing format checks on the seq_printf() parameters.  This hid two
format errors - shp-&gt;shm_segsz and shp-&gt;shm_nattach are both unsigned
long, but were being printed as unsigned int and signed int respectively.
This leads to 32-bit truncation of SHM segment sizes reported in
/proc/sysvipc/shm.  (And for nattach, but that's less of a problem for
most users).

This patch makes the format string directly visible to gcc's format
specifier checker, and fixes the two broken format specifiers.

Signed-off-by: Paul Menage &lt;menage@google.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Pierre Peiffer &lt;peifferp@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sysvipc_shm_proc_show() picks between format strings (based on the
expected maximum length of a SHM segment) in a way that prevents gcc from
performing format checks on the seq_printf() parameters.  This hid two
format errors - shp-&gt;shm_segsz and shp-&gt;shm_nattach are both unsigned
long, but were being printed as unsigned int and signed int respectively.
This leads to 32-bit truncation of SHM segment sizes reported in
/proc/sysvipc/shm.  (And for nattach, but that's less of a problem for
most users).

This patch makes the format string directly visible to gcc's format
specifier checker, and fixes the two broken format specifiers.

Signed-off-by: Paul Menage &lt;menage@google.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Pierre Peiffer &lt;peifferp@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>shm: Remove silly double assignment</title>
<updated>2008-06-10T14:58:00+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2008-06-10T12:53:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=c592713b3e124ce0719e6af4bc2520424c49cbae'/>
<id>c592713b3e124ce0719e6af4bc2520424c49cbae</id>
<content type='text'>
Found a silly double assignment of err is do_shmat.  Silly, but good to
clean up the useless code.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Found a silly double assignment of err is do_shmat.  Silly, but good to
clean up the useless code.

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPC: consolidate all xxxctl_down() functions</title>
<updated>2008-04-29T15:06:14+00:00</updated>
<author>
<name>Pierre Peiffer</name>
<email>pierre.peiffer@bull.net</email>
</author>
<published>2008-04-29T08:00:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a5f75e7f256f75759ec3d6dbef0ba932f1b397d2'/>
<id>a5f75e7f256f75759ec3d6dbef0ba932f1b397d2</id>
<content type='text'>
semctl_down(), msgctl_down() and shmctl_down() are used to handle the same set
of commands for each kind of IPC.  They all start to do the same job (they
retrieve the ipc and do some permission checks) before handling the commands
on their own.

This patch proposes to consolidate this by moving these same pieces of code
into one common function called ipcctl_pre_down().

It simplifies a little these xxxctl_down() functions and increases a little
the maintainability.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
semctl_down(), msgctl_down() and shmctl_down() are used to handle the same set
of commands for each kind of IPC.  They all start to do the same job (they
retrieve the ipc and do some permission checks) before handling the commands
on their own.

This patch proposes to consolidate this by moving these same pieces of code
into one common function called ipcctl_pre_down().

It simplifies a little these xxxctl_down() functions and increases a little
the maintainability.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPC: introduce ipc_update_perm()</title>
<updated>2008-04-29T15:06:13+00:00</updated>
<author>
<name>Pierre Peiffer</name>
<email>pierre.peiffer@bull.net</email>
</author>
<published>2008-04-29T08:00:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=8f4a3809c18ff3107bdbb1fabe3f4e5d2a928321'/>
<id>8f4a3809c18ff3107bdbb1fabe3f4e5d2a928321</id>
<content type='text'>
The IPC_SET command performs the same permission setting for all IPCs.  This
patch introduces a common ipc_update_perm() function to update these
permissions and makes use of it for all IPCs.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The IPC_SET command performs the same permission setting for all IPCs.  This
patch introduces a common ipc_update_perm() function to update these
permissions and makes use of it for all IPCs.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPC: get rid of the use *_setbuf structure.</title>
<updated>2008-04-29T15:06:13+00:00</updated>
<author>
<name>Pierre Peiffer</name>
<email>pierre.peiffer@bull.net</email>
</author>
<published>2008-04-29T08:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=016d7132f246a05e6e34ccba157fa278a96c45ae'/>
<id>016d7132f246a05e6e34ccba157fa278a96c45ae</id>
<content type='text'>
All IPCs make use of an intermetiate *_setbuf structure to handle the IPC_SET
command.  This is not really needed and, moreover, it complicates a little bit
the code.

This patch gets rid of the use of it and uses directly the semid64_ds/
msgid64_ds/shmid64_ds structure.

In addition of removing one struture declaration, it also simplifies and
improves a little bit the common 64-bits path.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All IPCs make use of an intermetiate *_setbuf structure to handle the IPC_SET
command.  This is not really needed and, moreover, it complicates a little bit
the code.

This patch gets rid of the use of it and uses directly the semid64_ds/
msgid64_ds/shmid64_ds structure.

In addition of removing one struture declaration, it also simplifies and
improves a little bit the common 64-bits path.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPC/shared memory: introduce shmctl_down</title>
<updated>2008-04-29T15:06:13+00:00</updated>
<author>
<name>Pierre Peiffer</name>
<email>pierre.peiffer@bull.net</email>
</author>
<published>2008-04-29T08:00:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=8d4cc8b5c5e5bac526618ee704f3cfdcad954e0c'/>
<id>8d4cc8b5c5e5bac526618ee704f3cfdcad954e0c</id>
<content type='text'>
Currently, the way the different commands are handled in sys_shmctl introduces
some duplicated code.

This patch introduces the shmctl_down function to handle all the commands
requiring the rwmutex to be taken in write mode (ie IPC_SET and IPC_RMID for
now).  It is the equivalent function of semctl_down for shared memory.

This removes some duplicated code for handling these both commands and
harmonizes the way they are handled among all IPCs.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the way the different commands are handled in sys_shmctl introduces
some duplicated code.

This patch introduces the shmctl_down function to handle all the commands
requiring the rwmutex to be taken in write mode (ie IPC_SET and IPC_RMID for
now).  It is the equivalent function of semctl_down for shared memory.

This removes some duplicated code for handling these both commands and
harmonizes the way they are handled among all IPCs.

Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPC: use ipc_buildid() directly from ipc_addid()</title>
<updated>2008-04-29T15:06:12+00:00</updated>
<author>
<name>Pierre Peiffer</name>
<email>pierre.peiffer@bull.net</email>
</author>
<published>2008-04-29T08:00:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=48dea404ed01869313f1908cca8a15774dcd8ee5'/>
<id>48dea404ed01869313f1908cca8a15774dcd8ee5</id>
<content type='text'>
By continuing to consolidate a little the IPC code, each id can be built
directly in ipc_addid() instead of having it built from each callers of
ipc_addid()

And I also remove shm_addid() in order to have, as much as possible, the
same code for shm/sem/msg.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By continuing to consolidate a little the IPC code, each id can be built
directly in ipc_addid() instead of having it built from each callers of
ipc_addid()

And I also remove shm_addid() in order to have, as much as possible, the
same code for shm/sem/msg.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pierre Peiffer &lt;pierre.peiffer@bull.net&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mempolicy: rework mempolicy Reference Counting [yet again]</title>
<updated>2008-04-28T15:58:24+00:00</updated>
<author>
<name>Lee Schermerhorn</name>
<email>lee.schermerhorn@hp.com</email>
</author>
<published>2008-04-28T09:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=52cd3b074050dd664380b5e8cfc85d4a6ed8ad48'/>
<id>52cd3b074050dd664380b5e8cfc85d4a6ed8ad48</id>
<content type='text'>
After further discussion with Christoph Lameter, it has become clear that my
earlier attempts to clean up the mempolicy reference counting were a bit of
overkill in some areas, resulting in superflous ref/unref in what are usually
fast paths.  In other areas, further inspection reveals that I botched the
unref for interleave policies.

A separate patch, suitable for upstream/stable trees, fixes up the known
errors in the previous attempt to fix reference counting.

This patch reworks the memory policy referencing counting and, one hopes,
simplifies the code.  Maybe I'll get it right this time.

See the update to the numa_memory_policy.txt document for a discussion of
memory policy reference counting that motivates this patch.

Summary:

Lookup of mempolicy, based on (vma, address) need only add a reference for
shared policy, and we need only unref the policy when finished for shared
policies.  So, this patch backs out all of the unneeded extra reference
counting added by my previous attempt.  It then unrefs only shared policies
when we're finished with them, using the mpol_cond_put() [conditional put]
helper function introduced by this patch.

Note that shmem_swapin() calls read_swap_cache_async() with a dummy vma
containing just the policy.  read_swap_cache_async() can call alloc_page_vma()
multiple times, so we can't let alloc_page_vma() unref the shared policy in
this case.  To avoid this, we make a copy of any non-null shared policy and
remove the MPOL_F_SHARED flag from the copy.  This copy occurs before reading
a page [or multiple pages] from swap, so the overhead should not be an issue
here.

I introduced a new static inline function "mpol_cond_copy()" to copy the
shared policy to an on-stack policy and remove the flags that would require a
conditional free.  The current implementation of mpol_cond_copy() assumes that
the struct mempolicy contains no pointers to dynamically allocated structures
that must be duplicated or reference counted during copy.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After further discussion with Christoph Lameter, it has become clear that my
earlier attempts to clean up the mempolicy reference counting were a bit of
overkill in some areas, resulting in superflous ref/unref in what are usually
fast paths.  In other areas, further inspection reveals that I botched the
unref for interleave policies.

A separate patch, suitable for upstream/stable trees, fixes up the known
errors in the previous attempt to fix reference counting.

This patch reworks the memory policy referencing counting and, one hopes,
simplifies the code.  Maybe I'll get it right this time.

See the update to the numa_memory_policy.txt document for a discussion of
memory policy reference counting that motivates this patch.

Summary:

Lookup of mempolicy, based on (vma, address) need only add a reference for
shared policy, and we need only unref the policy when finished for shared
policies.  So, this patch backs out all of the unneeded extra reference
counting added by my previous attempt.  It then unrefs only shared policies
when we're finished with them, using the mpol_cond_put() [conditional put]
helper function introduced by this patch.

Note that shmem_swapin() calls read_swap_cache_async() with a dummy vma
containing just the policy.  read_swap_cache_async() can call alloc_page_vma()
multiple times, so we can't let alloc_page_vma() unref the shared policy in
this case.  To avoid this, we make a copy of any non-null shared policy and
remove the MPOL_F_SHARED flag from the copy.  This copy occurs before reading
a page [or multiple pages] from swap, so the overhead should not be an issue
here.

I introduced a new static inline function "mpol_cond_copy()" to copy the
shared policy to an on-stack policy and remove the flags that would require a
conditional free.  The current implementation of mpol_cond_copy() assumes that
the struct mempolicy contains no pointers to dynamically allocated structures
that must be duplicated or reference counted during copy.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mempolicy: fixup Fallback for Default Shmem Policy</title>
<updated>2008-04-28T15:58:24+00:00</updated>
<author>
<name>Lee Schermerhorn</name>
<email>lee.schermerhorn@hp.com</email>
</author>
<published>2008-04-28T09:13:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=ae4d8c16aa22775f5731677abb8a82f03cec877e'/>
<id>ae4d8c16aa22775f5731677abb8a82f03cec877e</id>
<content type='text'>
get_vma_policy() is not handling fallback to task policy correctly when the
get_policy() vm_op returns NULL.  The NULL overwrites the 'pol' variable that
was holding the fallback task mempolicy.  So, it was falling back directly to
system default policy.

Fix get_vma_policy() to use only non-NULL policy returned from the vma
get_policy op.

shm_get_policy() was falling back to current task's mempolicy if the "backing
file system" [tmpfs vs hugetlbfs] does not support the get_policy vm_op and
the vma policy is null.  This is incorrect for show_numa_maps() which is
likely querying the numa_maps of some task other than current.  Remove this
fallback.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
get_vma_policy() is not handling fallback to task policy correctly when the
get_policy() vm_op returns NULL.  The NULL overwrites the 'pol' variable that
was holding the fallback task mempolicy.  So, it was falling back directly to
system default policy.

Fix get_vma_policy() to use only non-NULL policy returned from the vma
get_policy op.

shm_get_policy() was falling back to current task's mempolicy if the "backing
file system" [tmpfs vs hugetlbfs] does not support the get_policy vm_op and
the vma policy is null.  This is incorrect for show_numa_maps() which is
likely querying the numa_maps of some task other than current.  Remove this
fallback.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mempolicy: fix reference counting bugs</title>
<updated>2008-03-11T01:01:19+00:00</updated>
<author>
<name>Lee Schermerhorn</name>
<email>Lee.Schermerhorn@hp.com</email>
</author>
<published>2008-03-10T18:43:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=69682d852f5c94ee94e21174b3e8b719626c98db'/>
<id>69682d852f5c94ee94e21174b3e8b719626c98db</id>
<content type='text'>
Address 3 known bugs in the current memory policy reference counting method.
I have a series of patches to rework the reference counting to reduce overhead
in the allocation path.  However, that series will require testing in -mm once
I repost it.

1) alloc_page_vma() does not release the extra reference taken for
   vma/shared mempolicy when the mode == MPOL_INTERLEAVE.  This can result in
   leaking mempolicy structures.  This is probably occurring, but not being
   noticed.

   Fix:  add the conditional release of the reference.

2) hugezonelist unconditionally releases a reference on the mempolicy when
   mode == MPOL_INTERLEAVE.  This can result in decrementing the reference
   count for system default policy [should have no ill effect] or premature
   freeing of task policy.  If this occurred, the next allocation using task
   mempolicy would use the freed structure and probably BUG out.

   Fix:  add the necessary check to the release.

3) The current reference counting method assumes that vma 'get_policy()'
   methods automatically add an extra reference a non-NULL returned mempolicy.
    This is true for shmem_get_policy() used by tmpfs mappings, including
   regular page shm segments.  However, SHM_HUGETLB shm's, backed by
   hugetlbfs, just use the vma policy without the extra reference.  This
   results in freeing of the vma policy on the first allocation, with reuse of
   the freed mempolicy structure on subsequent allocations.

   Fix: Rather than add another condition to the conditional reference
   release, which occur in the allocation path, just add a reference when
   returning the vma policy in shm_get_policy() to match the assumptions.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Greg KH &lt;greg@kroah.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: &lt;eric.whitney@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Address 3 known bugs in the current memory policy reference counting method.
I have a series of patches to rework the reference counting to reduce overhead
in the allocation path.  However, that series will require testing in -mm once
I repost it.

1) alloc_page_vma() does not release the extra reference taken for
   vma/shared mempolicy when the mode == MPOL_INTERLEAVE.  This can result in
   leaking mempolicy structures.  This is probably occurring, but not being
   noticed.

   Fix:  add the conditional release of the reference.

2) hugezonelist unconditionally releases a reference on the mempolicy when
   mode == MPOL_INTERLEAVE.  This can result in decrementing the reference
   count for system default policy [should have no ill effect] or premature
   freeing of task policy.  If this occurred, the next allocation using task
   mempolicy would use the freed structure and probably BUG out.

   Fix:  add the necessary check to the release.

3) The current reference counting method assumes that vma 'get_policy()'
   methods automatically add an extra reference a non-NULL returned mempolicy.
    This is true for shmem_get_policy() used by tmpfs mappings, including
   regular page shm segments.  However, SHM_HUGETLB shm's, backed by
   hugetlbfs, just use the vma policy without the extra reference.  This
   results in freeing of the vma policy on the first allocation, with reuse of
   the freed mempolicy structure on subsequent allocations.

   Fix: Rather than add another condition to the conditional reference
   release, which occur in the allocation path, just add a reference when
   returning the vma policy in shm_get_policy() to match the assumptions.

Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: Greg KH &lt;greg@kroah.com&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: &lt;eric.whitney@hp.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
