| Age | Commit message (Collapse) | Author | Files | Lines |
|
Replace with a lock_debug_script member in ctdb_context.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Use path_helperdir() to help construct the path and then cache the
result in the existing static buffer (with length adjusted because
POSIX says the +1 is not necessary). Given the way this is used by
cluster_mutex_test, there is no (other) sane place to cache it.
path_helperdir_append() could be used to construct the path, but then
there would be an unnecessary talloc() result to free.
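A minimal sketch of the caching approach described above, not the actual CTDB code. The function example_helperdir() is a hypothetical stand-in for path_helperdir(), the directory and helper names are illustrative, and the use of PATH_MAX for the buffer size is an assumption about what "the +1 is not necessary" refers to (POSIX defines PATH_MAX as including the terminating NUL).

```c
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback for strict ISO C builds */
#endif

/* Hypothetical stand-in for path_helperdir(); the real function lives in
 * CTDB's path module and is not reproduced here. */
static const char *example_helperdir(void)
{
	return "/usr/libexec/ctdb";
}

/* Build the helper path once and cache it in a static buffer. */
static const char *example_mutex_helper(void)
{
	static char helper[PATH_MAX]; /* PATH_MAX already counts the NUL */
	int n;

	if (helper[0] != '\0') {
		return helper; /* cached by an earlier call */
	}

	n = snprintf(helper, sizeof(helper), "%s/%s",
		     example_helperdir(), "ctdb_mutex_fcntl_helper");
	if (n < 0 || (size_t)n >= sizeof(helper)) {
		helper[0] = '\0'; /* error or truncation: cache nothing */
		return NULL;
	}
	return helper;
}
```

Because the result lives in a static buffer, callers get the same pointer on every call and there is nothing to free.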
The flexibility in unit test cluster_mutex_003.sh was never used, so
remove this test. If other cluster mutex helpers are added then they
can be tested by separate tests.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Replace with a lock_helper member in ctdb_context, set using
path_helperdir_append().
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Simplify the initialisation of the path to eventd in eventd_context
using path_helperdir_append().
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Remove CTDB_RECOVERY_HELPER, CTDB_TAKEOVER_HELPER. Add new struct
members in ctdb_recoverd to contain the paths, set via
path_helperdir_append().
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
This currently causes binary data to be logged.
Instead, conditionally hex encode the key in a similar style to the
way it is done in dbwrap_ctdb.c:fetch_locked_internal(). In this
case, the key is truncated if the debug level is less than 10.
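The conditional hex encoding might look roughly like the following sketch. It does not reproduce fetch_locked_internal(); the function name, the 8-byte limit, and the "..." truncation marker are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define EXAMPLE_KEY_LOG_MAX 8 /* hypothetical truncation limit, in bytes */

/*
 * Render a binary key as hex into out. Below debug level 10, only the
 * first EXAMPLE_KEY_LOG_MAX bytes are encoded and "..." marks the cut.
 */
static void example_key_to_hex(const uint8_t *key, size_t keylen,
			       int debug_level, char *out, size_t outlen)
{
	size_t n = keylen;
	size_t pos = 0;
	size_t i;
	int truncated = 0;

	if (outlen == 0) {
		return;
	}
	out[0] = '\0';

	if (debug_level < 10 && n > EXAMPLE_KEY_LOG_MAX) {
		n = EXAMPLE_KEY_LOG_MAX;
		truncated = 1;
	}

	/* Each byte needs 2 hex digits plus room for the trailing NUL */
	for (i = 0; i < n && pos + 3 <= outlen; i++) {
		snprintf(out + pos, outlen - pos, "%02X", key[i]);
		pos += 2;
	}
	if (truncated && pos + 4 <= outlen) {
		snprintf(out + pos, outlen - pos, "...");
	}
}
```

The log call then prints the rendered string rather than the raw key bytes, so no binary data reaches the log.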
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Feb 9 12:21:08 UTC 2026 on atb-devel-224
|
|
ctdb_shutdown_sequence() normally exits. When we end up here, it is
because we have received a reclock callback twice. We can't handle
that: we have already removed "state", which would be referenced deep
in run_start_recovery_event() when returning here another time.
The bug is triggered since b84fbd7b3fedc998 introduced a nested event
loop, making ctdb_shutdown_sequence() return into
start_recovery_reclock_callback() due to multiple reclock checks being
triggered somehow (not sure exactly how, but we should not crash under
any circumstance).
Reproducer: Run one ctdb daemon with cluster lock set, try to start
another one without cluster lock set.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=15950
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Nov 19 03:04:13 UTC 2025 on atb-devel-224
|
|
This should really be a takeip. However, CTDB's weak check of the IP
address state (using bind(2)) incorrectly indicates that the IP
address is assigned to an interface so it is converted to an updateip.
After commit 0536d7a98b832fc00d26b09c26bf14fb63dbf5fb (which improves
IP address state checking), this will almost certainly not occur on
platforms with getifaddrs(3) (e.g. Linux). This means it is only
likely to occur in 4.21 when net.ipv4.ip_nonlocal_bind=1.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15935
Reported-by: Bailey Allison <ballison@45drives.com>
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
At the moment CTDB_SOCKET can be used outside of test mode even though
nobody should do this. So, no longer allow this.
This means ensuring CTDB_TEST_MODE is set in the
"clusteredmember" selftest environment, so that CTDB_SOCKET is
respected there.
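As a rough illustration of the rule (not the actual CTDB implementation), the socket override could be gated on test mode as below. The function name and the default path are hypothetical; only the CTDB_TEST_MODE and CTDB_SOCKET variable names come from the commit message.

```c
#include <stdlib.h>

/* Hypothetical default; the real path is chosen by CTDB's path module. */
#define EXAMPLE_DEFAULT_SOCKET "/var/run/ctdb/ctdbd.socket"

/* Honour CTDB_SOCKET only when CTDB_TEST_MODE is also set. */
static const char *example_socket_path(void)
{
	const char *test_mode = getenv("CTDB_TEST_MODE");
	const char *override = getenv("CTDB_SOCKET");

	if (test_mode != NULL && override != NULL) {
		return override;
	}

	/* Outside test mode the override is ignored */
	return EXAMPLE_DEFAULT_SOCKET;
}
```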
Details...
The associated use of chown(2) and chmod(2), used to secure the socket
in ctdb_daemon.c:ux_socket_bind(), potentially enables a symlink race
attack. However, the chown(2) is currently not done in test mode, so
restricting the use of CTDB_SOCKET to test mode solves the potential
security issue.
Also, sprinkle warnings about use of CTDB_TEST_MODE in appropriate
places, just to attempt to limit unwanted behaviour.
An alternative could be to use the socket file descriptor with
fchown(2) and fchmod(2). However, these system calls are not well
defined on sockets. Still, this was previously done in CTDB's early
days (using the poorly documented method where they are allowed in
Linux (only?) before calling bind(2)). It was removed (due to
portability issues, via commits
cf1056df94943ddcc3d547d4533b4bc04f57f265 and
2da3fe1b175a468fdff4aa4f65627facd2c28394) and replaced with the
current post-bind chown(2) and chmod(2).
I would like to remove the CTDB_SOCKET environment variable entirely,
since setting CTDB_TEST_MODE and CTDB_BASE covers all reasonable test
environments. However, I have a feeling that people use it for
interactive testing, and that can still be done in CTDB_TEST_MODE.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15921
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reported-by: *GUIAR OQBA * <techokba@gmail.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Thu Sep 25 09:02:06 UTC 2025 on atb-devel-224
|
|
If a delayed broadcast by a previous cluster lock holder arrives, the
new legitimate leader will accept this without questioning in
leader_handler(). Without this patch rec->leader will never be
overwritten, and because rec->pnn != rec->leader we'll also never send
out fresh leader broadcasts. And because we hold the cluster lock,
nobody else can step up.
Fix this in the next round of leader broadcast timeout.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=15892
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Aug 7 02:59:20 UTC 2025 on atb-devel-224
|
|
Change the variable name to "path" so it makes sense to reuse it for
the directory.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Jul 23 00:02:47 UTC 2025 on atb-devel-224
|
|
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Dropping this from ctdb_tunable_load_file() allows that function to be
called multiple times for different files. The caller sets the
defaults.
In the test script, factor out the handling of a single tunables file
in a similar way. Ignoring missing/unreadable files is OK because
this function will only be called for test successes (hence "ok" in
the name). There will never be existing, unreadable files. The code
being tested ignores missing files, so do that here too.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu May 29 10:57:35 UTC 2025 on atb-devel-224
|
|
See documentation change for details.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Even though all nodes may be shutting down there is still a very small
window for a race when multiple nodes are shut down. For simplicity,
assume 2 nodes. Assume the shutdowns of nodes are staggered, which is
usual because they're usually initiated by a loop (e.g. onnode -p all
ctdb shutdown). Although commands can continue in parallel, some
commands are started later than others.
Consider this sequence:
1. Node 0 reaches ctdb_shutdown_takeover() in
ctdb_shutdown_sequence() and a takeover run starts
2. Node 1 has not yet set its runlevel to SHUTDOWN in
ctdb_shutdown_sequence()
3. The leader node asks node 1 which IPs it can host
4. Node 1 replies "all of them"
5. Node 1 now sets its runlevel to SHUTDOWN in
ctdb_shutdown_sequence()
6. The leader node continues with the takeover run, first asking all
nodes to run "startipreallocate"
7. Node 0 runs "startipreallocate", so its NFS server starts grace
8. Node 1 does not run "startipreallocate" because it is not in
RUNNING runstate, so its NFS server does not start grace
9. The leader node continues with the takeover run, first asking all
nodes to run "releaseip" for IPs they can no longer hold
10. Node 0 releases all IPs, since it is SHUTDOWN runstate (so can't
host IPs)
11. As part of this, the NFS server on node 0 releases locks held
against IPs it is releasing
12. A client connected to node 1, where the NFS server is not in
grace, takes ("steals") one of those locks
This client is then permitted to reclaim the lock when nodes are
restarted.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Allows the timeout for failover during shutdown to be modified.
Defaults to 10s.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Without this, NFS servers on other nodes will not go into grace before
this node releases locks. This should also support improved behaviour
for SMB durable file handles.
The timeout is currently a constant 10s. However, it will
subsequently be switched to an option.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
An early shutdown can put ctdbd into SHUTDOWN runstate before ctdbd
has completed all early initialisation. Some of the start-time
transitions then attempt to set the runstate to FIRST_RECOVERY or
RUNNING, which would make the runstate go backwards, so ctdbd aborts.
Upcoming changes cause ctdbd shutdown to take longer, so the problem
will become more likely. With those changes, this can be
unreliably (50% of the time?) triggered by:
ctdb/tests/INTEGRATION/simple/cluster.091.version_check.sh
since it does an early shutdown due to a version mismatch.
Avoid this by noticing when the runstate is SHUTDOWN and refusing to
continue with subsequent early initialisation steps, which aren't
needed when shutting down.
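The guard can be sketched as follows. This is a simplified model, not ctdbd's code: the enum, struct, and function names are hypothetical, and the real daemon aborts on a backwards transition where this sketch only returns false.

```c
#include <stdbool.h>

/* Simplified, ordered runstates (names are illustrative) */
enum example_runstate {
	EXAMPLE_INIT,
	EXAMPLE_SETUP,
	EXAMPLE_FIRST_RECOVERY,
	EXAMPLE_STARTUP,
	EXAMPLE_RUNNING,
	EXAMPLE_SHUTDOWN
};

struct example_daemon {
	enum example_runstate runstate;
};

/* Returns false if the transition would move the runstate backwards. */
static bool example_set_runstate(struct example_daemon *d,
				 enum example_runstate new_state)
{
	if (new_state < d->runstate) {
		return false; /* the real daemon would abort here */
	}
	d->runstate = new_state;
	return true;
}

/* An early-initialisation step that may complete during shutdown. */
static bool example_first_recovery_step(struct example_daemon *d)
{
	if (d->runstate == EXAMPLE_SHUTDOWN) {
		/* Shutdown got in first: skip the rest of early init */
		return true;
	}
	return example_set_runstate(d, EXAMPLE_FIRST_RECOVERY);
}
```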
Earlier runstate transitions do not seem likely to cause an abort
during early shutdown. The following:
./tests/local_daemons.sh foo start 0; ./tests/local_daemons.sh foo stop 0
sees ctdbd already into FIRST_RECOVERY before the shutdown is
processed.
The change to ctdb_run_startup() probably isn't strictly necessary.
There will be no abort in this case. ctdb_shutdown_sequence() will
always run the "shutdown" event and then stop the event daemon, so it
doesn't seem possible that services could be left running. However,
we might as well avoid running the "startup" event when shutting down,
even if only to avoid confusing logs.
Ultimately, it seems like some redesign would be needed to avoid this
in a more predictable manner, rather than responding when an early
initialisation step inconveniently completes during shutdown. For
example, hanging a lot of the start-time event handling off a common
talloc context, could allow it to be cancelled with a single
TALLOC_FREE(). However, a change like that would involve a lot of
analysis to ensure that the talloc hierarchy is correct and there is
no chance of freed pointers being dereferenced. So, we're probably
better off just keeping this issue in mind during a broader redesign.
This workaround appears to be sufficient.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15858
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
The bzero() function has been deprecated for a long time.
In Samba it is replaced with memset(). But Samba also
provides common memory-zeroing macros, like ZERO_STRUCT().
In all places where bzero() is used, it is actually meant to
zero a structure or an array.
So replace these bzero() calls with ZERO_STRUCT() or
ZERO_ARRAY() as appropriate, and remove the bzero() replacement
and testing entirely.
While at it, also stop checking for the presence of memset():
this function has been standard for a very long time, and the
only conditional where HAVE_MEMSET was used was to
provide a replacement for bzero(). In all other places
memset() is used unconditionally.
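The shape of the replacement can be illustrated as below. The EXAMPLE_ macros are simplified local stand-ins written for this sketch; Samba's own ZERO_STRUCT() and ZERO_ARRAY() definitions differ in detail, and the structures are hypothetical.

```c
#include <string.h>

/* Simplified stand-ins for Samba's ZERO_STRUCT() and ZERO_ARRAY() */
#define EXAMPLE_ZERO_STRUCT(x) memset(&(x), 0, sizeof(x))
#define EXAMPLE_ZERO_ARRAY(x)  memset((x), 0, sizeof(x))

struct example_conn {
	int fd;
	char name[16];
};

struct example_state {
	struct example_conn conn;
	int counters[4];
};

static void example_reset(struct example_state *s)
{
	/* Before: bzero(&s->conn, sizeof(s->conn)); */
	EXAMPLE_ZERO_STRUCT(s->conn);

	/* Before: bzero(s->counters, sizeof(s->counters)); */
	EXAMPLE_ZERO_ARRAY(s->counters);
}
```

Note that an array macro of this kind relies on sizeof, so it only works on a true array, not on a pointer to its first element.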
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Pavel Filipenský <pfilipensky@samba.org>
|
|
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
We should not directly overwrite the pointer we are realloc'ing.
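The general pattern, shown here with libc realloc() as a stand-in for talloc_realloc() (an assumption made so the sketch is self-contained; the function name is hypothetical):

```c
#include <stdint.h>
#include <stdlib.h>

/* Grow *arr to hold new_count ints. On failure *arr is left untouched. */
static int example_grow(int **arr, size_t new_count)
{
	int *tmp;

	if (new_count > SIZE_MAX / sizeof(int)) {
		return -1; /* size calculation would overflow */
	}

	/* Assign to a temporary, never directly to *arr */
	tmp = realloc(*arr, new_count * sizeof(int));
	if (tmp == NULL) {
		return -1; /* *arr is still valid and still owned by caller */
	}

	*arr = tmp;
	return 0;
}
```

Writing the result straight back into the original pointer would lose the only reference to the old allocation whenever the reallocation fails.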
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Initialise the pointer to NULL and fall through to let
talloc_realloc() do the allocation. talloc_realloc() does the right
thing with a NULL pointer...
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
This is cheap when tcparray is NULL and lets the code that now
follows be simplified.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
This is harmless, so it doesn't generally need to be logged.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
Apply README.Coding, modernise logging, pre-render connection as a
string for logging, switch terminology from "tickle" to "connection",
tidy up comments.
No changes in functionality.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jerry Heyman <jheyman@ddn.com>
|
|
Reorder code to use early returns, modernise debug.
Best reviewed with "git show -w".
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
Autobuild-User(master): Anoop C S <anoopcs@samba.org>
Autobuild-Date(master): Tue Oct 8 06:42:04 UTC 2024 on atb-devel-224
|
|
Fix the comment (NULL versus -1), apply some README.Coding.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Uses of CTDB_BASE in the subsequent code are now handled by the path
module, so there is no point getting the value of CTDB_BASE. Instead,
check that the attempt to set it worked, noting that:
[...] if overwrite is zero, then the value of name is not
changed (and setenv() returns a success status).
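A small sketch of the check described above. The function name and the default value are hypothetical; the setenv() behaviour with overwrite set to zero is as quoted from the manual page.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical default value, for illustration only */
#define EXAMPLE_CTDB_BASE_DEFAULT "/etc/ctdb"

/* Ensure CTDB_BASE is set, keeping any value already in the environment. */
static int example_ensure_ctdb_base(void)
{
	/* overwrite == 0: an existing value is kept and 0 is still returned */
	if (setenv("CTDB_BASE", EXAMPLE_CTDB_BASE_DEFAULT, 0) != 0) {
		return -1; /* e.g. out of memory */
	}
	return 0;
}
```

Since success is reported whether or not the variable already existed, checking the return value is enough and there is no need to read CTDB_BASE back.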
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
No need to use CTDB_BASE directly.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Add some missing error handling and error messages.
Remove a use of CTDB_NO_MEMORY(), which then renders the caller's use
of ctdb_errstr() pointless, so remove that too.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Modernise the debug macros along the way.
These are done separately because they will require a little more
patience to review.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Define a static function to return the string. This clearly doesn't
need a ctdb_ prefix, but it matches ctdb_vnn_iface_string(), so
doesn't look out of place.
Use it in the places where review is trivial.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
These are currently converted to strings constantly in log messages
and other places. This clutters the code and probably has a minor
performance impact.
Add a new string field to the VNN structure. Populate it when a
public address is added and the VNN structure is allocated. This is
consistent with how node addresses are handled.
Don't use it yet, or this commit becomes huge.
A short-term goal is that each VNN public address will be converted to
a string only once. A longer-term goal is to reduce use of
ctdb_addr_to_str().
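The idea can be sketched as follows. This is not the CTDB VNN structure: the struct, field, and function names are hypothetical, plain calloc() stands in for talloc, and only IPv4 is handled to keep the example short.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdlib.h>

struct example_vnn {
	struct in_addr addr;
	char name[INET_ADDRSTRLEN]; /* rendered once, at allocation time */
};

/* Allocate a VNN and pre-render its address as a string. */
static struct example_vnn *example_vnn_new(const char *ip)
{
	struct example_vnn *vnn = calloc(1, sizeof(*vnn));

	if (vnn == NULL) {
		return NULL;
	}
	if (inet_pton(AF_INET, ip, &vnn->addr) != 1 ||
	    inet_ntop(AF_INET, &vnn->addr, vnn->name,
		      sizeof(vnn->name)) == NULL) {
		free(vnn);
		return NULL;
	}
	return vnn;
}

/* Accessor for log messages: no conversion happens here. */
static const char *example_vnn_address_string(const struct example_vnn *vnn)
{
	return vnn->name;
}
```

Log statements can then call the accessor as often as needed without repeating the address-to-string conversion.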
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
The word "no" was accidentally dropped in commit
1e47a1b3f6ab1e2ad9d86dfb28c3e086c99a97e5.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Unused since commit a10545ab6bd8a1b9ca87b0fdba8381cb8af0e284.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Currently, event failures are completely ignored in favour of checking
if the IP is on an interface. This misses the case where event
scripts up to and including 10.interface succeed, but something later
fails. When that occurs, count is incremented, so the failure is
counted as a success in the summary that is logged.
Fail when releaseip fails even though 10.interface succeeded in
releasing the IP. This may result in the IP address coming back, but
that's a different problem.
Underlying this is a design question about when releaseip is
successful. Should releaseip be a distinct operation, with subsequent
reconfigurations considered separately?
Update logging to clearly identify each of the 3 possible errors.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
It is more efficient calling ctdb_sys_local_ip_check() inside a loop
compared to calling ctdb_sys_have_ip(). There is a chance that this
is premature optimisation... but it sure is easy. Fall back to
checking with bind().
I think these checks really exist because of the weirdness fixed by
commit 4b4e4d8870475d994fe42a7b2c57dc69842d91f6. However, we might as
well do what we can.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Improve readability by not repeating the complex expression now
assigned to addr. ctdb_sys_have_ip() is called in both arms of the
if/else, so call it once when declaring the new variable.
Modernise debug macros while touching lines.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Anoop C S <anoopcs@samba.org>
|
|
Saves lines; str_list_add_printf takes care of NULL checks.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Sun Sep 22 10:44:59 UTC 2024 on atb-devel-224
|
|
I could not find out how to cast a char ** to const char ** without
warning. This transfers fine to the execv call as well.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Saves lines; str_list_add_printf takes care of NULL checks.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Noel Power <noel.power@suse.com>
|
|
Don't hide the real action inside an if-branch.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Noel Power <noel.power@suse.com>
|
|
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jennifer Sutton <jsutton@samba.org>
|
|
Code to setup the transport is about to be cleaned up, including
removing uses of ctdb_set_error(), so avoid logging a NULL pointer or
some other old error.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com>
Reviewed-by: Volker Lendecke <vl@samba.org>
|
|
In a future commit we will add support for loading the config file from
the `ctdb` command line tool. Prior to this change, the config file load
function always called D_NOTICE, which would make the command emit new text
and thus break all the tests that rely on specific output (not to
mention being something users could notice). This change plumbs a new
`verbose` argument into some of the config file loading functions.
Generally, all existing functions will have verbose set to true to match
the existing behavior. Future callers of this function can set it to
false in order to avoid emitting the extra text.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Rename ctdb_load_nodes_file to ctdb_load_nodes as it can now load nodes
from more than a regular file.
Signed-off-by: John Mulligan <jmulligan@redhat.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|