Obtain Bitcask write locks using system call flock/2
#257
Conversation
…le creation

Previous to this change, the OS pid was used to lock the bitcask database and prevent two bitcask instances performing write operations at the same time. However, if bitcask is not able to clean up the write lock because of a hard shutdown, and another process takes the pid that is written to the lock file, bitcask cannot take the lock even though it is the only running instance. This change uses the flock system call to obtain exclusive write access to the file across OS processes, as long as the other processes also use flock to obtain the lock.
Is this something for which it is even possible to create a Riak test?
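In case it helps reviewers, the flock approach described above looks roughly like this (a minimal sketch of the technique, not the PR's actual NIF code; `acquire_write_lock` is a made-up name):

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Sketch: flock-based write locking. The kernel releases the lock
 * automatically when the owning process exits, so a hard shutdown
 * cannot leave a "stale" lock behind the way a pid file can. */
int acquire_write_lock(const char *lockpath)
{
    int fd = open(lockpath, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* LOCK_NB: fail immediately rather than block if another process
     * (that also uses flock) already holds the lock. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);              /* errno is EWOULDBLOCK when locked */
        return -1;
    }
    return fd;  /* keep fd open for the lifetime of the writer */
}
```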
Code is clean. The tests all pass for me. Raised a question about copyright, but that's not something I personally care about either way, just be aware it may spawn discussion.
I don't have the expertise to say whether this is the right approach to the problem from a c/systems level, though the tests certainly show that it works.
I'd feel better with a more competent c/systems eye on it before it is merged (@tburghart, @Vagabond, (@mrallen1, I notice you HEART this PR) ANY OTHER?)
+1 from me, for what it is worth
@@ -3,6 +3,7 @@
 // bitcask: Eric Brewer-inspired key/value store
 //
 // Copyright (c) 2010 Basho Technologies, Inc. All Rights Reserved.
+// Copyright (c) 2018 Workday, Inc.
Is it necessary? We had a chat about this recently (I'll copy you in via mail.)
Oh yeah, and if you do merge it, please close the related issues.
Is there a Riak top-level project branch somewhere which is already configured to use your branch @andytill? The code looks +1 but, just to be sure, I would like to run it against some data if possible.
Sorry no, it's all linked in our internal riak_ee.
Might it be worth considering fcntl? https://www.perkin.org.uk/posts/solaris-portability-flock.html is a good summary of the topic.
Thanks @Licenser. You linked to https://github.com/basho/leveldb/blob/44cb7cbc85590280c9a73856470d5880f4015927/util/env_posix.cc#L493 in Slack; I'm just commenting here to bring that link into this discussion.
Very difficult, I think, to reproduce the OS pid on the restarted Riak to provoke the original bug. With the change it is possible to trigger the locked error by starting Riak, performing one write, then starting a second Riak and performing a write with that, which will return an error.

Oh yes, sorry @russelldb, should have added that too.
I’m happy without a riak_test, if there is no test to show the fault pre-fix.
I'd definitely echo the suggestion to look at fcntl; the link Heinz posted even has some compatibility shims, in addition to the eleveldb one.
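For reference, the fcntl-style lock that the linked article and the eleveldb code describe looks roughly like this (a sketch only, not code from this PR; `acquire_write_lock_fcntl` is a made-up name). F_SETLK is specified by POSIX, so it also works on Solaris/SmartOS, though with the well-known caveat that closing any descriptor for the file drops the lock for the whole process - hence the compatibility shims.

```c
#include <fcntl.h>
#include <string.h>

/* Sketch: a POSIX fcntl record lock covering the whole file. */
int acquire_write_lock_fcntl(int fd)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* exclusive write lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* length 0 means "to end of file" */

    /* Non-blocking: returns -1 (errno EACCES or EAGAIN) if the
     * lock is already held by another process. */
    return fcntl(fd, F_SETLK, &fl);
}
```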
On the CI server (my laptop) the quickcheck tests fail - output attached. It would be brilliant if we could get the tests to pass - then +1 from me. With the current implementation I caused data loss (last year) when opening a bitcask data directory twice, even though the second open was flagged as read_only (via a second beam), so any attention to this area feels useful, imho.
wrt above comment - emfile - I've increased file descriptors and rerun the quickcheck tests.
@bryanhuntesl that failure is emfile, are you sure you have your ulimit high enough?
Increased the file descriptor limit to 10000; tests still failing - output attached.
@bryanhuntesl can you add some info (erl + eqc version, platform)? I ran this 10 times today without failure, on my macbook.
Certainly - Macbook - i7/16GB
Not gonna moan about basho1 (should be basho10). Interesting. All the same as me. Ah well. Good luck someone with that one!
Just to confirm - I am running 16b02-basho10 - seems like it's been truncated in the erl console output.
I am using bitcask on SmartOS. But that aside, I see little reason not to use a POSIX-compliant function when there is one.
Sorry, my mistake, I thought you were doing everything with raw files/Postgres.
So many systems ;) In dalmatinerDB, yes, it's files/postgres/(leveldb, sadly that's tied to _core); snarl, my AAA system, uses bitcask for the timeouts. I'm happy to make a pull request on the pull request to move it to fcntl; it seems to be just one place that needs changes.
Thanks for the clarification @Licenser.
@Licenser thanks for offering to make this Solaris compatible, I'd welcome this change to the PR.
The lock test does seem to fail on OSX (unmodified code):

update 1: Odd, this did work on the second run, not sure as to why.
update 2: The eqc tests segfault :(
update 3: The call stack would be:
Still waiting to get my eqc license approved by work, then I'll try to reproduce.
Previous to this change, the OS pid was used to lock the bitcask database and prevent two bitcask instances performing write operations at the same time. If the pid in the lock file is currently taken by an OS process, then the bitcask DB may not be written to. If bitcask is not able to clean up the write lock because of a hard shutdown, it may not be able to delete the "stale" lock file and obtain a new one (for example, if the pid in it has been reused by another process).
This has been raised in issues #72, #226 and basho/riak#535.
This change uses the flock system call to obtain exclusive write access to the file across OS processes, as long as the other processes use flock to obtain the lock.
The change maintains the previous behaviour where the DB can be opened even if a write lock cannot be obtained, but returns an error on write operations.
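Sketched in C terms, that behaviour might look like this (hypothetical names throughout; the real implementation lives in the Erlang/NIF layer):

```c
#include <stdio.h>

/* Hypothetical handle: lock_fd is -1 when the flock write lock
 * could not be obtained at open time. */
typedef struct {
    int lock_fd;
} db_handle;

/* Opening still succeeds without the lock, but writes are refused. */
int db_put(db_handle *h, const char *key, const char *val)
{
    if (h->lock_fd < 0) {
        fprintf(stderr, "put %s: error, write lock not held\n", key);
        return -1;
    }
    (void)val;  /* ... append key/value to the active data file ... */
    return 0;
}
```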
The O_EXCL flag has been removed from the call to open in the nifs because, if the lock file still exists, it returns the eexist error - but now the proof of the write lock is not the file existing but the call to flock.
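To illustrate the flag change (a hypothetical before/after; the variable and function names are assumed):

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

int open_lock_file(const char *lockpath)
{
    /* Before this change (roughly): proof of the write lock was
     * exclusive file creation, so a leftover lock file from a dead
     * process made open() fail with EEXIST:
     *   fd = open(lockpath, O_CREAT | O_EXCL | O_RDWR, 0600);
     */

    /* After: the lock file may already exist; flock() is now the
     * proof of ownership, so a stale file is harmless. */
    int fd = open(lockpath, O_CREAT | O_RDWR, 0600);
    if (fd >= 0 && flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);   /* another live process holds the write lock */
        return -1;
    }
    return fd;
}
```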