<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/vhost, branch v3.15</title>
<subtitle>Clone of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git</subtitle>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/'/>
<entry>
<title>Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending</title>
<updated>2014-04-12T23:51:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-04-12T23:51:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=141eaccd018ef0476e94b180026d973db35460fd'/>
<id>141eaccd018ef0476e94b180026d973db35460fd</id>
<content type='text'>
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the target pending updates for v3.15-rc1.  Apologies in
  advance for waiting until the second to last day of the merge window
  to send these out.

  The highlights this round include:

   - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
   - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
   - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
   - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
   - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
   - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
   - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

  Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
  metadata into KVM guest have been left-out for now, as there where a
  few comments from MST + Paolo that where not able to be addressed in
  time for v3.15.  Please expect this feature for v3.16-rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
  ib_srpt: Use correct ib_sg_dma primitives
  target/tcm_fc: Rename ft_tport_create to ft_tport_get
  target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
  target/tcm_fc: Rename structs and list members for clarity
  target/tcm_fc: Limit to 1 TPG per wwn
  target/tcm_fc: Don't export ft_lport_list
  target/tcm_fc: Fix use-after-free of ft_tpg
  target: Add check to prevent Abort Task from aborting itself
  target: Enable READ_STRIP emulation in target_complete_ok_work
  target/sbc: Add sbc_dif_read_strip software emulation
  target: Enable WRITE_INSERT emulation in target_execute_cmd
  target/sbc: Add sbc_dif_generate software emulation
  target/sbc: Only expose PI read_cap16 bits when supported by fabric
  target/spc: Only expose PI mode page bits when supported by fabric
  target/spc: Only expose PI inquiry bits when supported by fabric
  target: Pass in transport supported PI at session initialization
  target/iblock: Fix double bioset_integrity_free bug
  Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
  target/rd: T10-Dif: RAM disk is allocating more space than required.
  iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull SCSI target updates from Nicholas Bellinger:
 "Here are the target pending updates for v3.15-rc1.  Apologies in
  advance for waiting until the second to last day of the merge window
  to send these out.

  The highlights this round include:

   - iser-target support for T10 PI (DIF) offloads (Sagi + Or)
   - Fix Task Aborted Status (TAS) handling in target-core (Alex Leung)
   - Pass in transport supported PI at session initialization (Sagi + MKP + nab)
   - Add WRITE_INSERT + READ_STRIP T10 PI support in target-core (nab + Sagi)
   - Fix iscsi-target ERL=2 ASYNC_EVENT connection pointer bug (nab)
   - Fix tcm_fc use-after-free of ft_tpg (Andy Grover)
   - Use correct ib_sg_dma primitives in ib_isert (Mike Marciniszyn)

  Also, note the virtio-scsi + vhost-scsi changes to expose T10 PI
  metadata into KVM guest have been left-out for now, as there where a
  few comments from MST + Paolo that where not able to be addressed in
  time for v3.15.  Please expect this feature for v3.16-rc1"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (43 commits)
  ib_srpt: Use correct ib_sg_dma primitives
  target/tcm_fc: Rename ft_tport_create to ft_tport_get
  target/tcm_fc: Rename ft_{add,del}_lport to {add,del}_wwn
  target/tcm_fc: Rename structs and list members for clarity
  target/tcm_fc: Limit to 1 TPG per wwn
  target/tcm_fc: Don't export ft_lport_list
  target/tcm_fc: Fix use-after-free of ft_tpg
  target: Add check to prevent Abort Task from aborting itself
  target: Enable READ_STRIP emulation in target_complete_ok_work
  target/sbc: Add sbc_dif_read_strip software emulation
  target: Enable WRITE_INSERT emulation in target_execute_cmd
  target/sbc: Add sbc_dif_generate software emulation
  target/sbc: Only expose PI read_cap16 bits when supported by fabric
  target/spc: Only expose PI mode page bits when supported by fabric
  target/spc: Only expose PI inquiry bits when supported by fabric
  target: Pass in transport supported PI at session initialization
  target/iblock: Fix double bioset_integrity_free bug
  Target/sbc: Initialize COMPARE_AND_WRITE write_sg scatterlist
  target/rd: T10-Dif: RAM disk is allocating more space than required.
  iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>target: Pass in transport supported PI at session initialization</title>
<updated>2014-04-07T08:48:54+00:00</updated>
<author>
<name>Nicholas Bellinger</name>
<email>nab@linux-iscsi.org</email>
</author>
<published>2014-04-02T19:52:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=e70beee783d6977d80eede88a3394f02eabddad1'/>
<id>e70beee783d6977d80eede88a3394f02eabddad1</id>
<content type='text'>
In order to support local WRITE_INSERT + READ_STRIP operations for
non PI enabled fabrics, the fabric driver needs to be able signal
what protection offload operations are supported.

This is done at session initialization time so the modes can be
signaled by individual se_wwn + se_portal_group endpoints, as well
as optionally across different transports on the same endpoint.

For iser-target, set TARGET_PROT_ALL if the underlying ib_device
has already signaled PI offload support, and allow this to be
exposed via a new iscsit_transport-&gt;iscsit_get_sup_prot_ops()
callback.

For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
operation.

For all other drivers, set TARGET_PROT_NORMAL to disable fabric
level PI.

Cc: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Cc: Quinn Tran &lt;quinn.tran@qlogic.com&gt;
Cc: Giridhar Malavali &lt;giridhar.malavali@qlogic.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to support local WRITE_INSERT + READ_STRIP operations for
non PI enabled fabrics, the fabric driver needs to be able signal
what protection offload operations are supported.

This is done at session initialization time so the modes can be
signaled by individual se_wwn + se_portal_group endpoints, as well
as optionally across different transports on the same endpoint.

For iser-target, set TARGET_PROT_ALL if the underlying ib_device
has already signaled PI offload support, and allow this to be
exposed via a new iscsit_transport-&gt;iscsit_get_sup_prot_ops()
callback.

For loopback, set TARGET_PROT_ALL to signal SCSI initiator mode
operation.

For all other drivers, set TARGET_PROT_NORMAL to disable fabric
level PI.

Cc: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Cc: Quinn Tran &lt;quinn.tran@qlogic.com&gt;
Cc: Giridhar Malavali &lt;giridhar.malavali@qlogic.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>target: Add TFO-&gt;abort_task for aborted task resources release</title>
<updated>2014-04-07T08:48:51+00:00</updated>
<author>
<name>Nicholas Bellinger</name>
<email>nab@linux-iscsi.org</email>
</author>
<published>2014-03-22T21:55:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=131e6abc674edb9f9a59090bb35bf6650569b7e7'/>
<id>131e6abc674edb9f9a59090bb35bf6650569b7e7</id>
<content type='text'>
Now that TASK_ABORTED status is not generated for all cases by
TMR ABORT_TASK + LUN_RESET, a new TFO-&gt;abort_task() caller is
necessary in order to give fabric drivers a chance to unmap
hardware / software resources before the se_cmd descriptor is
released via the normal TFO-&gt;release_cmd() codepath.

This patch adds TFO-&gt;aborted_task() in core_tmr_abort_task()
in place of the original transport_send_task_abort(), and
also updates all fabric drivers to implement this caller.

The fabric drivers that include changes to perform cleanup
via -&gt;aborted_task() are:

  - iscsi-target
  - iser-target
  - srpt
  - tcm_qla2xxx

The fabric drivers that currently set -&gt;aborted_task() to
NOPs are:

  - loopback
  - tcm_fc
  - usb-gadget
  - sbp-target
  - vhost-scsi

For the latter five, there appears to be no additional cleanup
required before invoking TFO-&gt;release_cmd() to release the
se_cmd descriptor.

v2 changes:
  - Move -&gt;aborted_task() call into transport_cmd_finish_abort (Alex)

Cc: Alex Leung &lt;amleung21@yahoo.com&gt;
Cc: Mark Rustad &lt;mark.d.rustad@intel.com&gt;
Cc: Roland Dreier &lt;roland@kernel.org&gt;
Cc: Vu Pham &lt;vu@mellanox.com&gt;
Cc: Chris Boot &lt;bootc@bootc.net&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Giridhar Malavali &lt;giridhar.malavali@qlogic.com&gt;
Cc: Saurav Kashyap &lt;saurav.kashyap@qlogic.com&gt;
Cc: Quinn Tran &lt;quinn.tran@qlogic.com&gt;
Cc: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that TASK_ABORTED status is not generated for all cases by
TMR ABORT_TASK + LUN_RESET, a new TFO-&gt;abort_task() caller is
necessary in order to give fabric drivers a chance to unmap
hardware / software resources before the se_cmd descriptor is
released via the normal TFO-&gt;release_cmd() codepath.

This patch adds TFO-&gt;aborted_task() in core_tmr_abort_task()
in place of the original transport_send_task_abort(), and
also updates all fabric drivers to implement this caller.

The fabric drivers that include changes to perform cleanup
via -&gt;aborted_task() are:

  - iscsi-target
  - iser-target
  - srpt
  - tcm_qla2xxx

The fabric drivers that currently set -&gt;aborted_task() to
NOPs are:

  - loopback
  - tcm_fc
  - usb-gadget
  - sbp-target
  - vhost-scsi

For the latter five, there appears to be no additional cleanup
required before invoking TFO-&gt;release_cmd() to release the
se_cmd descriptor.

v2 changes:
  - Move -&gt;aborted_task() call into transport_cmd_finish_abort (Alex)

Cc: Alex Leung &lt;amleung21@yahoo.com&gt;
Cc: Mark Rustad &lt;mark.d.rustad@intel.com&gt;
Cc: Roland Dreier &lt;roland@kernel.org&gt;
Cc: Vu Pham &lt;vu@mellanox.com&gt;
Cc: Chris Boot &lt;bootc@bootc.net&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Giridhar Malavali &lt;giridhar.malavali@qlogic.com&gt;
Cc: Saurav Kashyap &lt;saurav.kashyap@qlogic.com&gt;
Cc: Quinn Tran &lt;quinn.tran@qlogic.com&gt;
Cc: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Cc: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: don't open-code sockfd_put()</title>
<updated>2014-04-02T03:19:10+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-03-06T01:39:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=09aaacf02a3e88870ed5cad038a5bc822c947904'/>
<id>09aaacf02a3e88870ed5cad038a5bc822c947904</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: validate vhost_get_vq_desc return value</title>
<updated>2014-03-28T20:10:35+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-27T10:53:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=a39ee449f96a2cd44ce056d8a0a112211a9b1a1f'/>
<id>a39ee449f96a2cd44ce056d8a0a112211a9b1a1f</id>
<content type='text'>
vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.

The code in question was introduced in commit
8dd014adfea6f173c1ef6378f7e5e7924866c923
    vhost-net: mergeable buffers support

CVE-2014-0055

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vhost fails to validate negative error code
from vhost_get_vq_desc causing
a crash: we are using -EFAULT which is 0xfffffff2
as vector size, which exceeds the allocated size.

The code in question was introduced in commit
8dd014adfea6f173c1ef6378f7e5e7924866c923
    vhost-net: mergeable buffers support

CVE-2014-0055

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: fix total length when packets are too short</title>
<updated>2014-03-28T20:07:31+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-27T10:00:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=d8316f3991d207fe32881a9ac20241be8fa2bad0'/>
<id>d8316f3991d207fe32881a9ac20241be8fa2bad0</id>
<content type='text'>
When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.

This was intentional in order for make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.

Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.

Fix this up by detecting this overrun and doing packet drop
immediately.

CVE-2014-0077

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When mergeable buffers are disabled, and the
incoming packet is too large for the rx buffer,
get_rx_bufs returns success.

This was intentional in order for make recvmsg
truncate the packet and then handle_rx would
detect err != sock_len and drop it.

Unfortunately we pass the original sock_len to
recvmsg - which means we use parts of iov not fully
validated.

Fix this up by detecting this overrun and doing packet drop
immediately.

CVE-2014-0077

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending</title>
<updated>2014-03-02T03:33:09+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-03-02T03:33:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=702256e604ec143f69b391485ab32b2948772838'/>
<id>702256e604ec143f69b391485ab32b2948772838</id>
<content type='text'>
Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the series are bugfixes for qla2xxx target NPIV support
  that went in for v3.14-rc1.  Also included are a few DIF related
  fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
  protocol version related fix from Venkatesh.

  Also just a heads up that a series to address a number of issues with
  iser-target active I/O reset/shutdown is still being tested, and will
  be included in a separate -rc6 PULL request"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
  qla2xxx: Fix kernel panic on selective retransmission request
  Target/sbc: Don't use sg as iterator in sbc_verify_read
  target: Add DIF sense codes in transport_generic_request_failure
  target/sbc: Fix sbc_dif_copy_prot addr offset bug
  tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
  tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
  tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
  qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
  qla2xxx: Fix qlt_lport_register base_vha callback race
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull SCSI target fixes from Nicholas Bellinger:
 "The bulk of the series are bugfixes for qla2xxx target NPIV support
  that went in for v3.14-rc1.  Also included are a few DIF related
  fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
  protocol version related fix from Venkatesh.

  Also just a heads up that a series to address a number of issues with
  iser-target active I/O reset/shutdown is still being tested, and will
  be included in a separate -rc6 PULL request"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
  qla2xxx: Fix kernel panic on selective retransmission request
  Target/sbc: Don't use sg as iterator in sbc_verify_read
  target: Add DIF sense codes in transport_generic_request_failure
  target/sbc: Fix sbc_dif_copy_prot addr offset bug
  tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
  tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
  tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
  qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
  qla2xxx: Fix qlt_lport_register base_vha callback race
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost/scsi: Check LUN structure byte 0 is set to 1, per spec</title>
<updated>2014-02-25T00:19:43+00:00</updated>
<author>
<name>Venkatesh Srinivas</name>
<email>venkateshs@google.com</email>
</author>
<published>2014-02-24T22:13:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=7fe412d07d881020022a188b95c63a19b651a391'/>
<id>7fe412d07d881020022a188b95c63a19b651a391</id>
<content type='text'>
The virtio spec requires byte 0 of the virtio-scsi LUN structure
to be '1'.

Signed-off-by: Venkatesh Srinivas &lt;venkateshs@google.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The virtio spec requires byte 0 of the virtio-scsi LUN structure
to be '1'.

Signed-off-by: Venkatesh Srinivas &lt;venkateshs@google.com&gt;
Reviewed-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: fix a theoretical race in device cleanup</title>
<updated>2014-02-13T23:47:30+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-02-13T09:45:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=b0c057ca7e835b36c6050c7627634b664796c1d6'/>
<id>b0c057ca7e835b36c6050c7627634b664796c1d6</id>
<content type='text'>
vhost_zerocopy_callback accesses VQ right after it drops a ubuf
reference.  In theory, this could race with device removal which waits
on the ubuf kref, and crash on use after free.

Do all accesses within rcu read side critical section, and synchronize
on release.

Since callbacks are always invoked from bh, synchronize_rcu_bh seems
enough and will help release complete a bit faster.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vhost_zerocopy_callback accesses VQ right after it drops a ubuf
reference.  In theory, this could race with device removal which waits
on the ubuf kref, and crash on use after free.

Do all accesses within rcu read side critical section, and synchronize
on release.

Since callbacks are always invoked from bh, synchronize_rcu_bh seems
enough and will help release complete a bit faster.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: fix ref cnt checking deadlock</title>
<updated>2014-02-13T23:47:30+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-02-13T09:42:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.exis.tech/linux.git/commit/?id=0ad8b480d6ee916aa84324f69acf690142aecd0e'/>
<id>0ad8b480d6ee916aa84324f69acf690142aecd0e</id>
<content type='text'>
vhost checked the counter within the refcnt before decrementing.  It
really wanted to know that it is the one that has the last reference, as
a way to batch freeing resources a bit more efficiently.

Note: we only let refcount go to 0 on device release.

This works well but we now access the ref counter twice so there's a
race: all users might see a high count and decide to defer freeing
resources.
In the end no one initiates freeing resources until the last reference
is gone (which is on VM shotdown so might happen after a looooong time).

Let's do what we probably should have done straight away:
switch from kref to plain atomic, documenting the
semantics, return the refcount value atomically after decrement,
then use that to avoid the deadlock.

Reported-by: Qin Chuanyu &lt;qinchuanyu@huawei.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vhost checked the counter within the refcnt before decrementing.  It
really wanted to know that it is the one that has the last reference, as
a way to batch freeing resources a bit more efficiently.

Note: we only let refcount go to 0 on device release.

This works well but we now access the ref counter twice so there's a
race: all users might see a high count and decide to defer freeing
resources.
In the end no one initiates freeing resources until the last reference
is gone (which is on VM shotdown so might happen after a looooong time).

Let's do what we probably should have done straight away:
switch from kref to plain atomic, documenting the
semantics, return the refcount value atomically after decrement,
then use that to avoid the deadlock.

Reported-by: Qin Chuanyu &lt;qinchuanyu@huawei.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
