samba.git/lib/ntdb/test, branch talloc-2.0.8

lib/ntdb: Fix format string errors found by -Werror=format in ntdb tests

2012-07-30T04:25:10+00:00

ntdb: test arbitrary operations during ntdb_parse_record().

2012-06-22T05:35:17+00:00

In particular, this tests that we can store enough records to make the
database expand while we map the given record.  We use a global lock for
this, but it could happen in theory with another process.

It also tests that we can recurse inside ntdb_parse_record().

Signed-off-by: Rusty Russell

ntdb: make database read-only during ntdb_parse() callback.

2012-06-22T05:35:17+00:00

Since we hold a read lock, any write will try to grab a write lock: if
it happens to be on the same bucket, we'll fail.

For that reason, enforce read-only so every write operation fails
(even for NTDB_NOLOCK or NTDB_INTERNAL dbs), and document it!
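
A minimal sketch of the enforcement described above, under stated
assumptions: this is not the ntdb implementation, and every name in it
(struct db, db_store, db_parse, try_store_cb, DB_ERR_RDONLY) is
hypothetical.  It only shows the pattern of marking the handle read-only
for the duration of the callback and restoring the previous state
afterwards, so that nested parse calls keep working.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes, not the real ntdb ones. */
enum db_err { DB_OK = 0, DB_ERR_RDONLY = -1 };

/* Hypothetical handle holding a single integer "record". */
struct db {
	bool rdonly;
	int value;
};

typedef enum db_err (*parse_cb)(struct db *db, int value, void *priv);

/* Every write path checks the flag first, whatever the locking mode. */
static enum db_err db_store(struct db *db, int v)
{
	if (db->rdonly)
		return DB_ERR_RDONLY;
	db->value = v;
	return DB_OK;
}

/* Run cb on the record with the handle forced read-only. */
static enum db_err db_parse(struct db *db, parse_cb cb, void *priv)
{
	bool was_rdonly = db->rdonly;
	enum db_err ret;

	assert(cb != NULL);
	db->rdonly = true;
	ret = cb(db, db->value, priv);
	/* Restore rather than clear, so recursive parses stay read-only. */
	db->rdonly = was_rdonly;
	return ret;
}

/* Example callback that (incorrectly) tries to write during the parse. */
static enum db_err try_store_cb(struct db *db, int value, void *priv)
{
	(void)priv;
	return db_store(db, value + 1);
}
```

Calling db_parse() with try_store_cb returns DB_ERR_RDONLY and leaves
the stored value untouched; a db_store() after the parse returns
succeeds again.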

Signed-off-by: Rusty Russell

ntdb: don't munmap the database on every close.

2012-06-22T05:35:17+00:00

Since we can have multiple openers, we should leave the mmap in place
for the other openers to use.  Enhance the test to check the bug (it
still works, because without mmap we fall back to read/write, but
performance would be terrible!).

Signed-off-by: Rusty Russell

ntdb: respect TDB_NO_FSYNC flag for 'make test'

2012-06-22T05:35:16+00:00

This reduces test time from 31 seconds to 6, on my laptop.

Signed-off-by: Rusty Russell

ntdb: fix occasional abort in testing.

2012-06-20T14:50:20+00:00

Occasionally, the capability test inserts multiple used records and they
clash, but our primitive test layout engine doesn't handle hash clashes
and aborts.

Force a seed value which we know doesn't clash.

Reported-by: Andrew Bartlett 
Signed-off-by: Rusty Russell 

Autobuild-User(master): Rusty Russell 
Autobuild-Date(master): Wed Jun 20 16:50:20 CEST 2012 on sn-devel-104

ntdb: optimize ntdb_fetch.

2012-06-19T03:38:07+00:00

We access the key on lookup, then access the data in the caller.  It
makes more sense to access both at once.  We also put in a likely()
for the case where the hash is not chained.
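
A rough sketch of the two ideas above, not the actual ntdb_fetch code:
the record and bucket layouts and the names key_matches and bucket_find
are assumptions made for illustration.  The lookup hands back the whole
record, so the caller gets key and data from a single access, and the
unchained case is marked with likely().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Branch hint; falls back to a plain test without the GCC builtin. */
#if defined(__GNUC__) || defined(__clang__)
#define likely(cond) __builtin_expect(!!(cond), 1)
#else
#define likely(cond) (cond)
#endif

/* Hypothetical in-memory record: key and data live side by side. */
struct record {
	const char *key;
	size_t klen;
	const char *data;
	size_t dlen;
};

/* Hypothetical bucket: one record directly, or an overflow array. */
struct bucket {
	const struct record *single;
	const struct record *chain;
	size_t chain_len;
};

static int key_matches(const struct record *r, const char *key, size_t klen)
{
	return r->klen == klen && memcmp(r->key, key, klen) == 0;
}

/* Return the matching record (key and data together), or NULL. */
static const struct record *bucket_find(const struct bucket *b,
					const char *key, size_t klen)
{
	size_t i;

	assert(b != NULL);
	if (likely(b->chain == NULL)) {
		/* Common case: the bucket is not chained. */
		if (b->single != NULL && key_matches(b->single, key, klen))
			return b->single;
		return NULL;
	}
	for (i = 0; i < b->chain_len; i++)
		if (key_matches(&b->chain[i], key, klen))
			return &b->chain[i];
	return NULL;
}
```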

Before:
Adding 1000 records: 3644-3724(3675) ns (129656 bytes)
Finding 1000 records: 1596-1696(1622) ns (129656 bytes)
Missing 1000 records: 1409-1525(1452) ns (129656 bytes)
Traversing 1000 records: 1636-1747(1668) ns (129656 bytes)
Deleting 1000 records: 3138-3223(3175) ns (129656 bytes)
Re-adding 1000 records: 3278-3414(3329) ns (129656 bytes)
Appending 1000 records: 5396-5529(5426) ns (253312 bytes)
Churning 1000 records: 9451-10095(9584) ns (253312 bytes)
smbtorture results (--entries=1000)
ntdb speed 183881-191112(188223) ops/sec

After:
Adding 1000 records: 3590-3701(3640) ns (129656 bytes)
Finding 1000 records: 1539-1605(1566) ns (129656 bytes)
Missing 1000 records: 1398-1440(1413) ns (129656 bytes)
Traversing 1000 records: 1629-2015(1710) ns (129656 bytes)
Deleting 1000 records: 3118-3236(3163) ns (129656 bytes)
Re-adding 1000 records: 3235-3355(3275) ns (129656 bytes)
Appending 1000 records: 5335-5444(5385) ns (253312 bytes)
Churning 1000 records: 9350-9955(9494) ns (253312 bytes)
smbtorture results (--entries=1000)
ntdb speed 180559-199981(195106) ops/sec

ntdb: remove hash table trees.

2012-06-19T03:38:07+00:00

TDB2 started with a top-level hash of 1024 entries, divided into 128
groups of 8 buckets.  When a bucket filled, the 8-bucket group
expanded into pointers into 8 new 64-entry hash tables.  When these
filled, they expanded in turn, etc.

It's a nice idea to automatically expand the hash tables, but it
doesn't pay off.  Remove it for NTDB.

1) It only beats TDB performance when the database is huge and the
   TDB hashsize is small.  We are about 20% slower on medium-size
   databases (1000 to 10000 records), worse on really small ones.
2) Since we're 64 bits, our hash tables are already twice as expensive
   as TDB.
3) Since our hash function is good, it means that all groups tend to
   fill at the same time, meaning the hash enlarges by a factor of 128
   all at once, leading to a very large database at that point.
4) Our efficiency would improve if we enlarged the top level, but
   that makes our minimum db size even worse: it's already over 8k,
   and jumps to 1M after about 1000 entries!
5) Making the sub group size larger gives a shallower tree, which
   performs better, but makes the "hash explosion" problem worse.
6) The code is complicated, having to handle delete and reshuffling
   groups of hash buckets, and expansion of buckets.
7) We have to handle the case where all the records somehow end up with
   the same hash value, which requires special code to chain records for
   that case.

On the other hand, it would be nice if we didn't degrade as badly as
TDB does when the hash chains get long.

This patch removes the hash-growing code, but instead of chaining like
TDB does when a bucket fills, we point the bucket to an array of
record pointers.  Since each on-disk NTDB pointer contains some hash
bits from the record (we steal the upper 8 bits of the offset), 99.5%
of the time we don't need to load the record to determine if it
matches.  This makes an array of offsets much more cache-friendly than
a linked list.
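
A minimal sketch of the pointer trick described above.  This is not the
real NTDB on-disk encoding: which 8 hash bits are stored, and the names
pack_entry, entry_offset, entry_may_match and count_candidates, are
assumptions for illustration.  With 8 stored bits and a well-mixed
hash, an unrelated record passes the filter about 1 time in 256, which
is in line with the figure quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: top 8 bits hold hash bits, low 56 bits the offset. */
#define OFF_BITS 56
#define OFF_MASK ((UINT64_C(1) << OFF_BITS) - 1)

/* Combine a file offset with the top 8 bits of the record's hash. */
static uint64_t pack_entry(uint64_t offset, uint64_t hash)
{
	/* The offset must leave its upper 8 bits free. */
	assert((offset & ~OFF_MASK) == 0);
	return offset | ((hash >> OFF_BITS) << OFF_BITS);
}

static uint64_t entry_offset(uint64_t entry)
{
	return entry & OFF_MASK;
}

/* Cheap filter: false means the record cannot match, so skip loading it. */
static bool entry_may_match(uint64_t entry, uint64_t hash)
{
	return (entry >> OFF_BITS) == (hash >> OFF_BITS);
}

/* Scan a bucket's array and count the records we would have to load. */
static size_t count_candidates(const uint64_t *entries, size_t n,
			       uint64_t hash)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (entry_may_match(entries[i], hash))
			count++;
	return count;
}
```

Because the filter reads only the array of 64-bit entries, a scan stays
within a few contiguous cache lines instead of chasing one record after
another as a linked list would.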

Here are the times (in ns) for tdb_store of N records, tdb_store of N
records the second time, and a fetch of all N records.  I've also
included the final database size and the smbtorture local.[n]tdb_speed
results.

Benchmark details:
1) Compiled with -O2.
2) assert() was disabled in TDB2 and NTDB.
3) The "optimize fetch" patch was applied to NTDB.

10 runs, using tmpfs (otherwise massive swapping as db hits ~30M,
despite plenty of RAM).

                        Insert  Re-ins  Fetch    Size    dbspeed
                        (nsec)  (nsec) (nsec)    (kB)  (ops/sec)
TDB (10000 hashsize):
    100 records:          3882    3320   1609      53     203204
    1000 records:         3651    3281   1571     115     218021
    10000 records:        3404    3326   1595     880     202874
    100000 records:       4317    3825   2097    8262     126811
    1000000 records:     11568   11578   9320   77005      25046

TDB2 (1024 hashsize, expandable):
    100 records:          3867    3329   1699      17     187100
    1000 records:         4040    3249   1639     154     186255
    10000 records:        4143    3300   1695    1226     185110
    100000 records:       4481    3425   1800   17848     163483
    1000000 records:      4055    3534   1878  106386     160774

NTDB (8192 hashsize):
    100 records:          4259    3376   1692      82     190852
    1000 records:         3640    3275   1566     130     195106
    10000 records:        4337    3438   1614     773     188362
    100000 records:       4750    5165   1746    9001     169197
    1000000 records:      4897    5180   2341   83838     121901

Analysis:
	1) TDB wins on small databases, beating TDB2 by ~15%, NTDB by ~10%.
	2) TDB starts to lose when hash chains get 10 long (fetch 10% slower
	   than TDB2/NTDB).
	3) TDB does horribly when hash chains get 100 long (fetch 4x slower
	   than NTDB, 5x slower than TDB2, insert about 2-3x slower).
	4) TDB2 databases are 40% larger than TDB1.  NTDB is about 15% larger
	   than TDB1.

ntdb: allocator attribute.

2012-06-19T03:38:07+00:00

This is designed to allow us to make ntdb_context (and NTDB_DATA returned
from ntdb_fetch) a talloc pointer.  But it can also be used for any other
alternate allocator.
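
As an illustration of what such an attribute could look like, here is a
sketch with a call-counting allocator built on malloc.  The structure
and every name in it (alloc_hooks, alloc_stats, counting_hooks) are
hypothetical and are not taken from ntdb.h; the real attribute may
differ in fields and signatures.  The owner argument is included on the
assumption that a hierarchical allocator such as talloc would want a
parent for each new allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical set of allocator callbacks a database could be given. */
struct alloc_hooks {
	void *(*alloc)(const void *owner, size_t len, void *priv);
	void *(*expand)(void *old, size_t newlen, void *priv);
	void (*free)(void *old, void *priv);
	void *priv;	/* passed back to every callback */
};

/* Private state for the example allocator. */
struct alloc_stats {
	size_t allocs;
	size_t frees;
};

static void *counting_alloc(const void *owner, size_t len, void *priv)
{
	struct alloc_stats *stats = priv;

	(void)owner;	/* plain malloc has no notion of a parent */
	stats->allocs++;
	return malloc(len);
}

static void *counting_expand(void *old, size_t newlen, void *priv)
{
	(void)priv;
	return realloc(old, newlen);
}

static void counting_free(void *old, void *priv)
{
	struct alloc_stats *stats = priv;

	stats->frees++;
	free(old);
}

/* Bundle the callbacks with their private state. */
static struct alloc_hooks counting_hooks(struct alloc_stats *stats)
{
	struct alloc_hooks hooks;

	assert(stats != NULL);
	hooks.alloc = counting_alloc;
	hooks.expand = counting_expand;
	hooks.free = counting_free;
	hooks.priv = stats;
	return hooks;
}
```

A talloc-backed variant would follow the same shape, with the callbacks
forwarding to the talloc functions and using owner as the parent
context.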

Signed-off-by: Rusty Russell

ntdb: simply disallow NULL names.

2012-06-19T03:38:06+00:00

TDB allows this for internal databases, but it's a bad idea, since the
name is useful for logging.

They're a hassle to deal with, and we'd just end up putting "unnamed"
in there, so let the user deal with it.  If they don't, they get an
informative core dump.

Signed-off-by: Rusty Russell