| Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
ctdb reloadips will fail if it can't disable takover runs. The most
likely reason for this is that there is already a takeover run in
progress. We can't predict when this will happen, so retry if this
occurs.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Arguments are currently ignored anyway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This no longer does anything. Integration test cases now start and
shut down the cluster.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Integration test cases now start and shut down the cluster.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This is for the real cluster case.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Exit on first test failure instead of setting a variable. The bizarre
logic in ctdb_test_exit() makes this worth dropping.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
There are too many functions to start/stop daemons. Simplify this.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This test only runs against local daemons. Configuration is done via
script.options, which simplifies things quite a bit.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Preparation for recommending configuration for each script next to the
actual script.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This is the initial location that will be used by the new
multi-component aware event daemon.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This option adds a lot of unnecessary complexity to scripts.
Configuration should go in $CTDB_BASE, either directly or via a
symlink, so simplify by using the default location.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This was added for a vendor who decided not to use it. It is almost
certainly unused by anyone. If anyone really needs it then it is in
the git history.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
nmap-ncat is used in some distributions to replace netcat. It has a
different meaning for these options.
We can get the same effect as the current combination of -d and -w by
piping a sleep process to nc. Subsequent use of $! works because it
gets the last process in pipeline.
Note that redirecting from /dev/null doesn't work with some versions
of nc. They just exit when they get EOF.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Mar 9 12:24:13 CET 2018 on sn-devel-144
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12978
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
"ctdb reloadips" use of ipreallocate() can result in a spurious
takeover runs. This can cause a subsequent "ctdb reloadips" to fail
to disable takeover runs (due to there being one already in progress).
There are various possible improvements but a proper fix probably
requires a protocol change. That would mean receiving an ACK for a
takeover run request to indicate that the request will be processes
and then a broadcast to indicate a completed takeover run.
There are various other partial fixes (e.g. de-duping queued takeover
run requests against those in the in-progess queue) and workarounds
(e.g. always do a double ipreallocate() in the tool, which should
absorb the spurious takeover run).
However, this is unlikely to be a real-world problem. Real use cases
should not involve repeatedly reloading the IP configuration.
Instead, work around the problem of flaky tests by manually adding
"ctdb sync" commands to cause extra no-op takeover runs. These should
not add spurious takeover runs and will create synchronisation points
to help avoid the issue.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
These commands are now replaced with ctdb event ...
ctdb scriptstatus is maintained for backward compatibility.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Aug 18 02:50:16 CEST 2016 on sn-devel-144
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This drastically simplifies the code. "ctdb reloadips" behaves the
same, since it causes a takeover run immediately after IPs are
deleted. "ctdb delip" now needs to be followed with an explicit "ctdb
ipreallocate".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This test sets TakeoverTimeout=90 to avoid banning during takeover.
However, the setting is done on the test node instead of the recovery
master node. During "ctdb reloadips", the recovery master will used
the default value of TakeoverTimeout.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
Current NFS and CIFS tickle tests do not test the killtcp
functionality on the releasing node. 2-way killing is done for NFS,
so this test explicitly looks for packets from the releasing node.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
tcpdump does not support filtering on MAC address when reading from a
file. Therefore, this is implemented by conditionally using grep to
filter the output of tcpdump.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This makes the next commit much easier to read.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
There's a tiny chance that the connection information may not be
transferred to other nodes quickly enough, so add an explicit wait.
Also clean up the description and recognise that it is the takeover
node that does the tickling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This is a gawk extension and can't be used reliably if just running
"awk". It is simple enough to switch to using the standard sub() and
gsub() functions.
The alternative is to switch to explicitly running "gawk". However,
although the eventscripts aren't exactly portable, it is probably
better to move closer to portability than further away.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
tcptickle_sniff_start() assumes that if $dst contains a ': then it
should use the IPv6 sniffing code. However, $dst is a socket, so has
a trailing ":<port>".
Strip the trailing ":<port>" before checking for ':' as a marker for
an IPv6 address.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
These tests simulate a dead node rather than a CTDB failure, so drop
IP addresses when killing a "node" to avoid problems with duplicates.
To cope with a CTDB failure a watchdog would be needed to ensure that
the public IPs are dropped when CTDB dies. Let's not do that now.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Fri Dec 5 23:29:39 CET 2014 on sn-devel-104
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This helps with debugging.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Extend select_test_node_and_ips() to set $test_prefix in addition to
$test_ip.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
There are parentheses missing that stop the default pattern from
matching commands with trailing garbage (e.g. "exportfs.orig").
A careful check of POSIX (and running GNU sed with --posix) suggests
that "\|" isn't a supported way of specifying alternation in a regular
expression. Therefore, it is clearer to switch to extended regular
expressions so that this has a chance of being portable (even though
the point is to print /proc/<pid>/stack, which only works on Linux).
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Nov 18 06:37:45 CET 2014 on sn-devel-104
|
|
Some of this implements logic that exists in functions. Some of it is
overly complicated and potentially failure-prone.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
Debugging can still be running when a monitor event times out and
scriptstatus output changes.
When debugging a hung script to a log file, write to a temporary file
and move the temporary file over the log file when done. The test
then waits for the log file to appear.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jul 3 08:19:23 CEST 2014 on sn-devel-104
|
|
About a year ago a check was added to _cluster_is_healthy() to make
sure that node 0 isn't in recovery. This was to avoid unexpected
recoveries causing tests to fail. However, it was misguided because
each test initially calls cluster_is_healthy() and will now fail if an
unexpected recovery occurs.
Instead, have cluster_is_healthy() warn if the cluster is in recovery.
Also:
* Rename wait_until_healthy() to wait_until_ready() because it waits
until both healthy and out of recovery.
* Change the post-recovery sleep in restart_ctdb() to 2 seconds and
add a loop to wait (for 2 seconds at a time) if the cluster is back
in recovery. The logic here is that the re-recovery timeout has
been set to 1 second, so sleeping for just 1 second might race
against the next recovery.
* Use reverse logic in node_has_status() so that it works for "all".
* Tweak wait_until() so that it can handle timeouts with a
recheck-interval specified.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
This one ensures that a newly started node gets an up-to-date tickle
list. Tweak some of the integration test functions to accommodate
this.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|