<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/Documentation/features/vm, branch v6.12.80</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>x86: remove PG_uncached</title>
<updated>2024-09-04T04:15:46+00:00</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2024-08-21T19:34:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7a87225ae2c6c317c7b80cf599e5cf0eee699196'/>
<id>7a87225ae2c6c317c7b80cf599e5cf0eee699196</id>
<content type='text'>
Convert x86 to use PG_arch_2 instead of PG_uncached and remove
PG_uncached.
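
A minimal sketch of the aliasing pattern (the accessor names below are
hand-rolled for illustration; the real conversion goes through the
generic pageflags macros rather than open-coded bit ops):

 /* x86 reuses the generic arch-private flag bit instead of keeping a
  * dedicated PG_uncached; an open-coded equivalent of the accessors: */
 #define PageUncached(page)        test_bit(PG_arch_2, &amp;(page)-&gt;flags)
 #define SetPageUncached(page)     set_bit(PG_arch_2, &amp;(page)-&gt;flags)
 #define ClearPageUncached(page)   clear_bit(PG_arch_2, &amp;(page)-&gt;flags)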

Link: https://lkml.kernel.org/r/20240821193445.2294269-11-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>riscv: Add support for BATCHED_UNMAP_TLB_FLUSH</title>
<updated>2024-01-11T16:01:53+00:00</updated>
<author>
<name>Alexandre Ghiti</name>
<email>alexghiti@rivosinc.com</email>
</author>
<published>2024-01-08T19:36:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=54d7431af73e2fa53b73cfeb2bec559c6664a4e4'/>
<id>54d7431af73e2fa53b73cfeb2bec559c6664a4e4</id>
<content type='text'>
Allow the TLB flush to be deferred when unmapping pages, which reduces
both the number of IPIs and the number of sfence.vma instructions.
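
As a rough sketch of the batched-unmap contract the architecture fills
in (the stage-2 flush helper below is hypothetical; the actual riscv
code sends sfence.vma/IPIs to the recorded CPUs):

 struct arch_tlbflush_unmap_batch {
         struct cpumask cpumask;   /* CPUs that may cache stale entries */
 };

 /* Stage 1: record which CPUs need a flush; do not flush yet. */
 static inline void
 arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
                           struct mm_struct *mm, unsigned long uaddr)
 {
         cpumask_or(&amp;batch-&gt;cpumask, &amp;batch-&gt;cpumask, mm_cpumask(mm));
 }

 /* Stage 2: one flush round covers every unmap deferred above. */
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
         flush_tlb_all_on(&amp;batch-&gt;cpumask);   /* hypothetical helper */
         cpumask_clear(&amp;batch-&gt;cpumask);
 }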

The microbenchmark used in commit 43b3dfdd0455 ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration"),
made multithreaded here to force the use of IPIs, shows a good
performance improvement on all platforms:

* Unmatched: ~34%
* TH1520   : ~78%
* Qemu     : ~81%

In addition, perf on qemu reports a significant decrease in the time
spent dealing with IPIs:

Before:  68.17%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call
After :   8.64%  main     [kernel.kallsyms]            [k] __sbi_rfence_v02_call

* Benchmark (the missing includes and SIZE definition are added here so
  it builds standalone; SIZE follows the 1 MB used by the arm64
  commit's benchmark):

#define _GNU_SOURCE
#include &lt;errno.h&gt;
#include &lt;pthread.h&gt;
#include &lt;sched.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/mman.h&gt;

#define SIZE (1 * 1024 * 1024)

/* Pin the calling thread to one core so unmaps must IPI that core. */
int stick_this_thread_to_core(int core_id) {
        int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
        if (core_id &lt; 0 || core_id &gt;= num_cores)
                return EINVAL;

        cpu_set_t cpuset;
        CPU_ZERO(&amp;cpuset);
        CPU_SET(core_id, &amp;cpuset);

        pthread_t current_thread = pthread_self();
        return pthread_setaffinity_np(current_thread,
                                      sizeof(cpu_set_t), &amp;cpuset);
}

/* Idle worker: occupies a core so it becomes a TLB-shootdown target. */
static void *fn_thread(void *p_data)
{
        stick_this_thread_to_core((int)(long)p_data);

        while (1)
                sleep(1);

        return NULL;
}

int main(void)
{
        volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        pthread_t threads[4];
        int ret;

        for (int i = 0; i &lt; 4; ++i) {
                ret = pthread_create(&amp;threads[i], NULL, fn_thread,
                                     (void *)(long)i);
                if (ret)
                        printf("%s", strerror(ret));
        }

        memset((void *)p, 0x88, SIZE);

        for (int k = 0; k &lt; 10000; k++) {
                /* swap in: touch every page so it faults back in */
                for (int i = 0; i &lt; SIZE; i += 4096)
                        (void)p[i];

                /* swap out: reclaim the whole range, triggering unmaps */
                madvise((void *)p, SIZE, MADV_PAGEOUT);
        }

        for (int i = 0; i &lt; 4; i++)
                pthread_cancel(threads[i]);

        for (int i = 0; i &lt; 4; i++)
                pthread_join(threads[i], NULL);

        return 0;
}

Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Reviewed-by: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Tested-by: Jisheng Zhang &lt;jszhang@kernel.org&gt; # Tested on TH1520
Tested-by: Nam Cao &lt;namcao@linutronix.de&gt;
Link: https://lore.kernel.org/r/20240108193640.344929-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>Documentation: Drop IA64 from feature descriptions</title>
<updated>2023-09-11T08:13:18+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ardb@kernel.org</email>
</author>
<published>2023-01-13T17:07:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=af1f459233d4edeef634f559539e7f4b64cb1d25'/>
<id>af1f459233d4edeef634f559539e7f4b64cb1d25</id>
<content type='text'>
Itanium (IA64) is going away, so drop it from the kernel feature
documentation.

Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'docs-6.6' of git://git.lwn.net/linux</title>
<updated>2023-08-31T03:05:42+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-31T03:05:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=cd99b9eb4b702563c5ac7d26b632a628f5a832a5'/>
<id>cd99b9eb4b702563c5ac7d26b632a628f5a832a5</id>
<content type='text'>
Pull documentation updates from Jonathan Corbet:
 "Documentation work keeps chugging along; this includes:

   - Work from Carlos Bilbao to integrate rustdoc output into the
     generated HTML documentation. This took some work to figure out how
     to do it without slowing the docs build and without creating
     problems for people who don't have Rust installed, but Carlos got
     there
   - Move the loongarch and mips architecture documentation under
     Documentation/arch/

   - Some more maintainer documentation from Jakub

  ... plus the usual assortment of updates, translations, and fixes"

* tag 'docs-6.6' of git://git.lwn.net/linux: (56 commits)
  Docu: genericirq.rst: fix irq-example
  input: docs: pxrc: remove reference to phoenix-sim
  Documentation: serial-console: Fix literal block marker
  docs/mm: remove references to hmm_mirror ops and clean typos
  docs/zh_CN: correct regi_chg(),regi_add() to region_chg(),region_add()
  Documentation: Fix typos
  Documentation/ABI: Fix typos
  scripts: kernel-doc: fix macro handling in enums
  scripts: kernel-doc: parse DEFINE_DMA_UNMAP_[ADDR|LEN]
  Documentation: riscv: Update boot image header since EFI stub is supported
  Documentation: riscv: Add early boot document
  Documentation: arm: Add bootargs to the table of added DT parameters
  docs: kernel-parameters: Refer to the correct bitmap function
  doc: update params of memhp_default_state=
  docs: Add book to process/kernel-docs.rst
  docs: sparse: fix invalid link addresses
  docs: vfs: clean up after the iterate() removal
  docs: Add a section on surveys to the researcher guidelines
  docs: move mips under arch
  docs: move loongarch under arch
  ...
</content>
</entry>
<entry>
<title>arm64: support batched/deferred tlb shootdown during page reclamation/migration</title>
<updated>2023-08-18T17:12:37+00:00</updated>
<author>
<name>Barry Song</name>
<email>v-songbaohua@oppo.com</email>
</author>
<published>2023-07-17T13:10:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=43b3dfdd04553171488cb11d46d21948b6b90e27'/>
<id>43b3dfdd04553171488cb11d46d21948b6b90e27</id>
<content type='text'>
On x86, batched and deferred TLB shootdown has led to a 90% performance
increase in TLB shootdown. On arm64, hardware can do TLB shootdown
without a software IPI, but the synchronous tlbi is still quite
expensive.

Even the simplest program that requires swapout demonstrates this:
 #include &lt;sys/types.h&gt;
 #include &lt;unistd.h&gt;
 #include &lt;sys/mman.h&gt;
 #include &lt;string.h&gt;

 int main()
 {
 #define SIZE (1 * 1024 * 1024)
         volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);

         memset((void *)p, 0x88, SIZE);

         for (int k = 0; k &lt; 10000; k++) {
                 /* swap in */
                 for (int i = 0; i &lt; SIZE; i += 4096) {
                         (void)p[i];
                 }

                 /* swap out */
                 madvise((void *)p, SIZE, MADV_PAGEOUT);
         }

         return 0;
 }

Perf results on a Snapdragon 888 with 8 cores, using zRAM as the swap
block device:

 ~ # perf record taskset -c 4 ./a.out
 [ perf record: Woken up 10 times to write data ]
 [ perf record: Captured and wrote 2.297 MB perf.data (60084 samples) ]
 ~ # perf report
 # To display the perf.data header info, please use --header/--header-only options.
 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 60K of event 'cycles'
 # Event count (approx.): 35706225414
 #
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  ......
 #
    21.07%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     8.23%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     6.67%  a.out    [kernel.kallsyms]  [k] filemap_map_pages
     6.16%  a.out    [kernel.kallsyms]  [k] __zram_bvec_write
     5.36%  a.out    [kernel.kallsyms]  [k] ptep_clear_flush
     3.71%  a.out    [kernel.kallsyms]  [k] _raw_spin_lock
     3.49%  a.out    [kernel.kallsyms]  [k] memset64
     1.63%  a.out    [kernel.kallsyms]  [k] clear_page
     1.42%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock
     1.26%  a.out    [kernel.kallsyms]  [k] mod_zone_state.llvm.8525150236079521930
     1.23%  a.out    [kernel.kallsyms]  [k] xas_load
     1.15%  a.out    [kernel.kallsyms]  [k] zram_slot_lock

ptep_clear_flush() takes 5.36% of CPU time in this micro-benchmark,
which swaps a page mapped by only one process in and out. If the page
is mapped by multiple processes (typically more than 100 on a phone),
the overhead is much higher, as the TLB flush has to run 100 times for
one single page. Moreover, TLB flush overhead grows with the number of
CPU cores due to the poor scalability of TLB shootdown in hardware, so
arm64 servers should expect much higher overhead.

Further perf annotate shows that 95% of the CPU time in
ptep_clear_flush() is spent in the final dsb(), waiting for the TLB
flush to complete. This is a very good opportunity to leverage the
kernel's existing batched TLB infrastructure: the minimal modification
is to issue only an asynchronous tlbi in the first stage, and to issue
the dsb in the second stage only when we actually have to synchronize.
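
A sketch of that two-stage split, close to the shape of the arm64
change (simplified; the user-ASID tlbi and the batch bookkeeping are
omitted):

 /* Stage 1, at unmap time: broadcast the invalidate, do not wait. */
 static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
                                            unsigned long uaddr)
 {
         unsigned long addr = __TLBI_VADDR(uaddr, ASID(mm));

         dsb(ishst);              /* make the PTE update visible first */
         __tlbi(vale1is, addr);   /* async broadcast invalidate */
 }

 /* Stage 2, once per batch: a single barrier waits for all of them. */
 static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
         dsb(ish);
 }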

With the simple micro-benchmark above, the elapsed time to finish the
program decreases by around 5%.

Typical elapsed time w/o patch:
 ~ # time taskset -c 4 ./a.out
 0.21user 14.34system 0:14.69elapsed
w/ patch:
 ~ # time taskset -c 4 ./a.out
 0.22user 13.45system 0:13.80elapsed

Also tested with the benchmark from this commit on a Kunpeng920 arm64
server, observing an improvement of around 12.5% with
`time ./swap_bench`.
        w/o             w/
real    0m13.460s       0m11.771s
user    0m0.248s        0m0.279s
sys     0m12.039s       0m11.458s

Originally, a 16.99% overhead from ptep_clear_flush() was observed,
which this patch eliminates:

[root@localhost yang]# perf record -- ./swap_bench &amp;&amp; perf report
[...]
16.99%  swap_bench  [kernel.kallsyms]  [k] ptep_clear_flush

This was tested on 4-, 8-, and 128-CPU platforms; it is beneficial on
large systems but may show no improvement on small systems such as a
4-CPU platform.

This patch also improves the performance of page migration. Using
pmbench and migrating its pages between node 0 and node 1 a hundred
times over 1G of memory, the patch decreases the time used by around
20% (18.338318910 sec before, 13.981866350 sec after), saving the time
otherwise spent in ptep_clear_flush().

Link: https://lkml.kernel.org/r/20230717131004.12662-5-yangyicong@huawei.com
Tested-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Punit Agrawal &lt;punit.agrawal@bytedance.com&gt;
Signed-off-by: Barry Song &lt;v-songbaohua@oppo.com&gt;
Signed-off-by: Yicong Yang &lt;yangyicong@hisilicon.com&gt;
Reviewed-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Reviewed-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Reviewed-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Nadav Amit &lt;namit@vmware.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Barry Song &lt;baohua@kernel.org&gt;
Cc: Darren Hart &lt;darren@os.amperecomputing.com&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: lipeifeng &lt;lipeifeng@oppo.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Steven Miao &lt;realmz6@gmail.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Zeng Tao &lt;prime.zeng@hisilicon.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Documentation/features: Refresh support files for 6.5</title>
<updated>2023-07-14T18:42:49+00:00</updated>
<author>
<name>Tiezhu Yang</name>
<email>yangtiezhu@loongson.cn</email>
</author>
<published>2023-07-11T07:32:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=8379c1552bbda054885d2e4f3092c36adab717f1'/>
<id>8379c1552bbda054885d2e4f3092c36adab717f1</id>
<content type='text'>
Run the refresh script [1] to document the recent feature additions.

[1] Documentation/features/scripts/features-refresh.sh

Signed-off-by: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Link: https://lore.kernel.org/r/1689060720-4628-3-git-send-email-yangtiezhu@loongson.cn
</content>
</entry>
<entry>
<title>Documentation/features: Check ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT</title>
<updated>2023-07-14T18:42:49+00:00</updated>
<author>
<name>Tiezhu Yang</name>
<email>yangtiezhu@loongson.cn</email>
</author>
<published>2023-07-11T07:31:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=f4135fca44a6779488012e05067edaa5262950a0'/>
<id>f4135fca44a6779488012e05067edaa5262950a0</id>
<content type='text'>
ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT selects ARCH_HAS_ELF_RANDOMIZE,
so add ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT as another Kconfig check
for the ELF-ASLR feature; the refresh script can then handle this case
for all architectures.

Co-developed-by: Xi Ruoyao &lt;xry111@xry111.site&gt;
Signed-off-by: Xi Ruoyao &lt;xry111@xry111.site&gt;
Signed-off-by: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Link: https://lore.kernel.org/r/1689060720-4628-2-git-send-email-yangtiezhu@loongson.cn
</content>
</entry>
<entry>
<title>Merge tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux</title>
<updated>2022-12-14T23:23:49+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-14T23:23:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=eb67d239f3aa1711afb0a42eab50459d9f3d672e'/>
<id>eb67d239f3aa1711afb0a42eab50459d9f3d672e</id>
<content type='text'>
Pull RISC-V updates from Palmer Dabbelt:

 - Support for the T-Head PMU via the perf subsystem

 - ftrace support for rv32

 - Support for non-volatile memory devices

 - Various fixes and cleanups

* tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
  Documentation: RISC-V: patch-acceptance: s/implementor/implementer
  Documentation: RISC-V: Mention the UEFI Standards
  Documentation: RISC-V: Allow patches for non-standard behavior
  Documentation: RISC-V: Fix a typo in patch-acceptance
  riscv: Fixup compile error with !MMU
  riscv: Fix P4D_SHIFT definition for 3-level page table mode
  riscv: Apply a static assert to riscv_isa_ext_id
  RISC-V: Add some comments about the shadow and overflow stacks
  RISC-V: Align the shadow stack
  RISC-V: Ensure Zicbom has a valid block size
  RISC-V: Introduce riscv_isa_extension_check
  RISC-V: Improve use of isa2hwcap[]
  riscv: Don't duplicate _ALTERNATIVE_CFG* macros
  riscv: alternatives: Drop the underscores from the assembly macro names
  riscv: alternatives: Don't name unused macro parameters
  riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2
  riscv: mm: call best_map_size many times during linear-mapping
  riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
  riscv: Fix crash during early errata patching
  riscv: boot: add zstd support
  ...
</content>
</entry>
<entry>
<title>Documentation/features: Use loongarch instead of loong</title>
<updated>2022-12-05T09:50:12+00:00</updated>
<author>
<name>Tiezhu Yang</name>
<email>yangtiezhu@loongson.cn</email>
</author>
<published>2022-12-04T12:18:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=cc8c418b4fc09ed58ddd27b8e90ec797e9ca1e67'/>
<id>cc8c418b4fc09ed58ddd27b8e90ec797e9ca1e67</id>
<content type='text'>
The official arch name is LoongArch [1], so use the lower-case
loongarch rather than loong in Documentation/features; running
features-refresh.sh refreshes all the related files.

[1] https://www.kernel.org/doc/html/latest/loongarch/index.html

Fixes: 5860800e8696 ("Documentation/features: Update the arch support status files")
Signed-off-by: Tiezhu Yang &lt;yangtiezhu@loongson.cn&gt;
Link: https://lore.kernel.org/r/1670156327-9631-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</content>
</entry>
<entry>
<title>riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT</title>
<updated>2022-10-29T00:10:01+00:00</updated>
<author>
<name>Liu Shixin</name>
<email>liushixin2@huawei.com</email>
</author>
<published>2022-10-12T12:00:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=310f541a027b1d5dc68f44f176cde618e6ee9691'/>
<id>310f541a027b1d5dc68f44f176cde618e6ee9691</id>
<content type='text'>
This sets the HAVE_ARCH_HUGE_VMAP option and defines the required page
table functions. With this feature, ioremap areas are mapped with
huge-page granularity according to their actual size. The feature can
be disabled with the kernel parameter "nohugeiomap".
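
A sketch of the kind of helpers HAVE_ARCH_HUGE_VMAP expects from the
architecture (simplified to the PMD level; the PUD level and the exact
riscv definitions differ in detail):

 /* Tell generic vmalloc/ioremap that PMD-size leaf mappings are OK. */
 bool arch_vmap_pmd_supported(pgprot_t prot)
 {
         return IS_ENABLED(CONFIG_64BIT);
 }

 /* Install a leaf mapping directly at the PMD level. */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t phys, pgprot_t prot)
 {
         set_pmd(pmd, pfn_pmd(__phys_to_pfn(phys), prot));
         return 1;
 }

 int pmd_clear_huge(pmd_t *pmd)
 {
         if (!pmd_leaf(*pmd))
                 return 0;
         pmd_clear(pmd);
         return 1;
 }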

Signed-off-by: Liu Shixin &lt;liushixin2@huawei.com&gt;
Reviewed-by: Björn Töpel &lt;bjorn@kernel.org&gt;
Tested-by: Björn Töpel &lt;bjorn@kernel.org&gt;
Link: https://lore.kernel.org/r/20221012120038.1034354-2-liushixin2@huawei.com
[Palmer: minor formatting]
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
</feed>
