Avoid UUID.randomUUID() in file system related startup code #5450
geoand wants to merge 1 commit into eclipse-vertx:master
| // the cacheDir will be suffixed a unique id to avoid eavesdropping from other processes/users | ||
| // also this ensures that if process A deletes cacheDir, it won't affect process B | ||
| String cacheDirName = fileCacheDir + "-" + UUID.randomUUID(); | ||
| String cacheDirName = fileCacheDir + "-" + System.nanoTime(); |
nanoTime is not absolute - it's relative to the process. That means another application can get the same value when it starts, without the two starting simultaneously - is that what you expect?
nanoTime is not absolute - it's relative to the process
Correct. But if we do think it can be problematic, I'm happy to use Random.getRandom() or System.currentTimeMillis()
if we use Math.random() it gets better - but it is still not guaranteed to be unique - because it still uses System::nanoTime, and Random per se doesn't guarantee uniqueness across processes (try printing new Random(42).nextInt() and running it twice in 2 different processes...)
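The point about seeded generators can be checked with a small sketch. The class and method names below are illustrative only and are not part of the PR; the sketch simply shows that two generators built from the same seed produce the same first value, which is what two separate processes would also observe.

```java
import java.util.Random;

class SeededRandomDemo {
  // Returns the first int produced by a Random created with the given seed.
  static int firstInt(long seed) {
    return new Random(seed).nextInt();
  }

  public static void main(String[] args) {
    // Both lines print the same number: the sequence depends only on the seed,
    // not on which generator instance (or which process) produced it.
    System.out.println(firstInt(42));
    System.out.println(firstInt(42));
  }
}
```

A generator created with the no-argument constructor derives its seed partly from System.nanoTime(), so collisions are unlikely but not excluded.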
I am pretty sure we are not looking for that strong of a guarantee here, but I'll let the maintainers be the judge of that
How about an optimistic attempt? Something like (simplifying):
for (;;) {
  try {
    String cacheDirName = fileCacheDir + "-" + System.nanoTime();
    // createDirectory (unlike createDirectories) fails if the directory already exists
    Files.createDirectory(Paths.get(cacheDirName));
    break;
  } catch (FileAlreadyExistsException ignore) {
    // name already taken: retry with a new value
  }
}
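A self-contained version of the optimistic approach might look like the sketch below. The class name, the `createUnique` helper and its parameters are hypothetical and are not taken from the Vert.x code base; the sketch relies on `Files.createDirectory`, which creates the directory atomically and fails if the entry already exists.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class OptimisticCacheDir {
  // Hypothetical helper: creates "<prefix>-<nanoTime>" under parent,
  // retrying with a fresh suffix whenever the name is already taken.
  static Path createUnique(Path parent, String prefix) {
    for (;;) {
      Path candidate = parent.resolve(prefix + "-" + System.nanoTime());
      try {
        // Atomic check-and-create: throws if the entry already exists.
        return Files.createDirectory(candidate);
      } catch (FileAlreadyExistsException e) {
        // Another process or thread took this name: loop and try again.
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Path parent = Files.createTempDirectory("demo");
    System.out.println(createUnique(parent, "vertx-cache"));
    System.out.println(createUnique(parent, "vertx-cache"));
  }
}
```

With this shape the suffix does not need to be unique by construction, because a collision only costs one extra iteration.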
In fact, you could use a random instead of System.nanoTime, I think it would be faster
@franz1981 is Random.nextLong() faster than System.nanoTime()?
nope - or better - usually nanoTime (if not on the cloud with unreliable time sources) uses a thing called rdtsc which is as cheap as reading a memory area
| private String generateDeploymentID() { | ||
| return UUID.randomUUID().toString(); | ||
| return Long.valueOf(System.nanoTime()).toString(); |
This needs to be globally unique when running in clustered mode with HA enabled
So perhaps something like:
if (vertx.isClustered() && vertx.haManager() != null) {
  return UUID.randomUUID().toString();
}
// Use a counter?
It's pretty common to deploy verticles concurrently. Even when Vert.x is not clustered, the returned value should be unique.
I have updated it to use Random, is that what you meant?
I meant incrementing an AtomicLong counter instead of using a random value (uniqueness is guaranteed and it shouldn't change the perf results you got)
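As a minimal sketch of the counter idea, assuming a single JVM: the class below is illustrative and is not the actual DefaultDeploymentManager code. An AtomicLong hands out a distinct value to every caller, including concurrent ones, but only within one process, so the clustered HA case raised earlier in this thread would still need a globally unique value.

```java
import java.util.concurrent.atomic.AtomicLong;

class DeploymentIds {
  // Shared counter; getAndIncrement is atomic, so no two callers see the same value.
  private static final AtomicLong NEXT_ID = new AtomicLong();

  // Unique within this JVM only (assumption: non-clustered deployment).
  static String generateDeploymentID() {
    return Long.toString(NEXT_ID.getAndIncrement());
  }

  public static void main(String[] args) {
    System.out.println(generateDeploymentID());
    System.out.println(generateDeploymentID());
  }
}
```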
what seems to take time is the initialization of SecureRandom.getDefaultPRNG due to loading providers, I think we could generate a faster UUID by using a given provider
But those are not public APIs, no?
I think we should have a way to specify the exact cache dir (e.g.
Sure, that would make sense for us too
b28445c to 7c9fc8f
I have updated the PR per suggestions
Is there anything else you would like me to do for this one?
For usability, it seems to me adding a boolean to the options would be enough (it's what's computed in the end to determine if a UUID should be added to the path). But it's a matter of taste so I'm fine with keeping an extra dir option if you choose so @vietj
Is there anything more you want me to do with this one?
@vietj PTAL
🙏🏽
vertx-core/src/main/java/io/vertx/core/file/FileSystemOptions.java (outdated)
| public static final Logger log = LoggerFactory.getLogger(DefaultDeploymentManager.class); | ||
| private static final AtomicLong nextId = new AtomicLong(); |
the idea of cache dir is to avoid that, and keep the same behaviour we have, so please no.
So what do you propose? This was added as a response to #5450 (comment)
I believe you confused two changes @vietj : the cache dir that used a random UUID, and the verticle id generator
sorry, can we have the verticle id generator in another PR then?
I'd like to keep distinct PRs for the changelog
Distinct PR or commit?
8ba99cf to c749aaa
vietj left a comment
can you add a test with a real vertx instance and check those cases
- a missing dir is created
- an existing dir is reused
- an error is thrown when a non-directory file already exists
Sure, I'll do that when I'm back from JFokus
Aren't those cases already covered by the tests for
good question, I don't know :-)
That's what I understand from looking at FileResolverTestBase
| public void testGetTheExactCacheDirWithoutHacks() { | ||
| String cacheDir = new FileResolverImpl(new FileSystemOptions().setExactFileCacheDir(cacheBaseDir + "-exact")).cacheDir(); | ||
| if (cacheDir != null) { | ||
| System.out.println(cacheDir); |
| } | ||
| @Test | ||
| public void testGetTheExactCacheDirWithoutHacks() { |
this test should be moved to FileCacheTest instead, FileResolverTestBase tests the behaviour of resolver implementations
vietj left a comment
We need tests that assess the behaviour of creating a vertx instance when
- the cache dir already exists and is a directory (I guess it reuses the directory)
- the cache dir does not exist (it should create the missing directory)
- the cache dir string is not a valid value
- the cache dir does not exist and cannot be created, e.g. the parent path points to a file
- the cache dir exists but is not a directory, e.g. it is a file
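The file system side of these cases can be sketched with plain java.nio, without a Vert.x instance. The `ensureCacheDir` helper below is hypothetical and only illustrates the expected outcomes; it is not the FileResolverImpl logic. The invalid-string case is left out because which strings count as invalid paths depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class CacheDirCheck {
  // Hypothetical helper: reuses an existing directory, creates a missing one,
  // and fails if the path (or one of its parents) is a regular file.
  static Path ensureCacheDir(Path dir) {
    try {
      if (Files.isDirectory(dir)) {
        return dir; // existing directory is reused
      }
      // Creates missing parents too; throws an IOException if dir exists as a
      // file or if a parent path points to a file.
      return Files.createDirectories(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("cache-demo");
    System.out.println(ensureCacheDir(base.resolve("missing"))); // created
    System.out.println(ensureCacheDir(base.resolve("missing"))); // reused
  }
}
```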
I added all but
| // also this ensures that if process A deletes cacheDir, it won't affect process B | ||
| String cacheDirName = fileCacheDir + "-" + UUID.randomUUID(); | ||
| File cacheDir = new File(cacheDirName); | ||
| File cacheDir = isEffectiveValue ? new File(fileCacheDir) : new File(fileCacheDir + "-" + UUID.randomUUID()); |
Doesn't this change break the comment above? When isEffectiveValue is true, 2 vert.x instances will interfere with each other's cache. While this is probably ok for the same application, if the 2 applications differ, it could cause invalid states.
One example (regardless of whether the 2 applications are the same or not): the moment the 1st terminates, it deletes the cache, which means it is also deleted for the second, causing inconsistencies and errors.
I think it is fine because effective value is part of the options and users have to explicitly enable it, so they control it and have to care about it
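The trade-off discussed in this thread can be reproduced with a small sketch. Everything in it (the class, `open`, `close`, the `exact` flag) is a hypothetical stand-in for an instance that creates its cache directory on startup and deletes it on shutdown; it is not the Vert.x implementation, and it only handles empty directories.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

class SharedCacheDirDemo {
  // With exact=true the base path is used as-is; otherwise a UUID suffix is added.
  static Path open(Path base, boolean exact) {
    Path target = exact ? base : base.resolveSibling(base.getFileName() + "-" + UUID.randomUUID());
    try {
      return Files.createDirectories(target);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Simplified shutdown: removes the (empty) cache directory.
  static void close(Path dir) {
    try {
      Files.deleteIfExists(dir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("demo").resolve("cache");
    Path first = open(base, true);
    Path second = open(base, true); // same directory as first
    close(first);
    // The second "instance" has lost its cache directory as well.
    System.out.println(Files.exists(second));
  }
}
```

With `exact` set to false each call gets its own directory, so closing one leaves the other intact, at the cost of the UUID generation this PR tries to avoid.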
Is there any chance we can move this forward? @gsmet also independently ran into it
c94173c to 5a95731
I would really appreciate some input on this
This is done because bootstrapping the plumbing needed by the JDK to produce a UUID value is expensive, it thus doesn't make sense to pay this cost when the property isn't actually needed
If this is approved, could it be backported to 4?
@geoand I would just like, as before, to change the configuration option instead of passing a different path
@vietj Thanks. Can you elaborate a little more on what exactly you would like to see, because I don't fully understand your proposal
Motivation:
This is done because bootstrapping the plumbing
needed by the JDK to produce a UUID value
is expensive, it thus doesn't make sense to
pay this cost when the property isn't actually
needed
We are making an effort in Quarkus to improve startup time even further by eliminating various bottlenecks across the board.
The first call to UUID.randomUUID() is definitely heavy (as shown in the following flamegraph), and if we can avoid it in startup code (as we have in the development branch of Quarkus), it would be nice.
P.S. Ideally we would like to have this in Vert.x 4 as well.
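A rough way to observe the first-call cost is sketched below. The class and method names are illustrative, and a single System.nanoTime() measurement is not a rigorous benchmark, so the numbers are only indicative; a proper measurement would use a profiler or a benchmark harness.

```java
import java.util.UUID;

class FirstUuidCost {
  // Returns the elapsed nanoseconds for a single UUID.randomUUID() call.
  static long timeOneCall() {
    long start = System.nanoTime();
    UUID.randomUUID();
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    // The first call includes the one-time SecureRandom setup;
    // the second call reuses the already initialized generator.
    System.out.println("first call:  " + timeOneCall() + " ns");
    System.out.println("second call: " + timeOneCall() + " ns");
  }
}
```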