Skip to content

OAK-12010 Simplified index management (without optimizer)#2689

Open
thomasmueller wants to merge 6 commits intotrunkfrom
OAK-12010-subset
Open

OAK-12010 Simplified index management (without optimizer)#2689
thomasmueller wants to merge 6 commits intotrunkfrom
OAK-12010-subset

Conversation

@thomasmueller
Copy link
Member

@thomasmueller thomasmueller commented Jan 19, 2026

This is a subset of the (much) larger draft PR #2652

This is just support for the "diff.index"; without the optimizer.

@sonarqubecloud
Copy link

Copy link
Contributor

@bhabegger bhabegger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean work 👍

I like the test coverage and the comments among the rest !

I don't see anything blocking and just have a few comments/suggestions.

if (!diffContent.exists()) {
continue;
}
PropertyState lastMod = diffContent.getProperty("jcr:lastModified");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there not a constant definition for this type of "well-known" entries ? (same for :lastProcessed, jcr:data, /oak:index/, etc.)

In my opinion this helps a lot to understand, identify in one go what are the 'system' properties, paths, etc., and know where they are used. But also have them documented in the code base as javadoc.

(Not a blocker, just a curiosity 😉)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There probably is. I agree we are not consistently using these constants, so there is a risk of typos.

* @param store the node store
* @param indexDefinitions the /oak:index node
*/
public static void applyDiffIndexChanges(NodeStore store, NodeBuilder indexDefinitions) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ease readability maybe we could split this method in :

  • parsing entries
  • diffing entries

and delegate the orchestration to a caller method ?

JsonObject repositoryDefinitions = RootIndexesListService.getRootIndexDefinitions(indexDefinitions);
LOG.debug("Index list {}", repositoryDefinitions.toString());
try {
DiffIndexMerger.instance().merge(newImageLuceneDefinitions, repositoryDefinitions, store);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge also seams to do something with diff.index and diff.index.optimizer couldn't we call the mergeDiff directly instead (without a dependency to the NodeStore)?


// merge
JsonObject merged = null;
if (indexDiff == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, maybe this premerge null checking logic could be moved down into the processMerge itself ? (The "nothing to do" case could be handled by a null or Optional merged result)

*
* "null" entries are not supported.
*/
public class JsonNodeBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Builder in the name could be misleading as, unless I missed something. this class doesn't seem to follow the "Builder" pattern, and seams more like a Helper class (but then there are different style tastes out their).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right... It's a converter from "Json" to "NodeBuilder". It's confusing.

What about JsonNodeWriter, or JsonToNodeConverter?

Copy link
Contributor

@bhabegger bhabegger Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer Converter among the two as Writer is quite connotated (PrintWriter, StringWriter, etc.). However, both Writer and Converter still suggests some sort of instantiation and don't highlight that this class is a utility class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JsonNodeConversionUtil ?
JsonNodeConversionHelper ?

import java.util.ArrayList;

import org.apache.felix.inventory.Format;
import org.apache.jackrabbit.oak.commons.json.JsonObject;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing that OAK has it's own JSON processing library for historical reasons ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

+ " } }");

repositoryDefinitions = RootIndexesListService.getRootIndexDefinitions(store, "lucene");
assertSameJson("{\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame Java 11 is a bit too early for text blocks :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree (btw. text blocks are not "raw" and still need escaping).

But I think it's readable as is.

public void createDummy() {
// when enabling "deleteCreatesDummyIndex", then a dummy index is created
// (that indexes /dummy, which doesn't exist)
String merged = new DiffIndexMerger(new String[0], true, true, false).processMerge(JsonObject.fromJson("{}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which true is it ?

Suggested change
String merged = new DiffIndexMerger(new String[0], true, true, false).processMerge(JsonObject.fromJson("{}"
boolean deleteCreatesDummyIndex = true;
String merged = new DiffIndexMerger(new String[0], deleteCreatesDummyIndex, true, false).processMerge(JsonObject.fromJson("{}"

Would make it explicit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, "list of boolean flags" is not user friendly... thinking about an alternative.

thomasmueller and others added 2 commits February 2, 2026 16:18
…/diff/DiffIndex.java

Co-authored-by: Benjamin Habegger <benjamin@habegger.tech>
…/diff/DiffIndex.java

Co-authored-by: Benjamin Habegger <benjamin@habegger.tech>
Copy link
Contributor

@amit-jain amit-jain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me!!
A few minor suggestions added.

Comment on lines +140 to +141
InputStream in = jcrData.getValue(Type.BINARY).getNewStream();
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
InputStream in = jcrData.getValue(Type.BINARY).getNewStream();
try {
try (InputStream in = jcrData.getValue(Type.BINARY).getNewStream()) {

return result;
}

public static Collection<String> filterNewestIndexes(Collection<String> indexPaths, NodeState rootState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rootState ununsed parameter, consider removing

return StringUtils.convertBytesToHex(md.digest(bytes));
} catch (NoSuchAlgorithmException e) {
// SHA-256 is guaranteed to be available in standard Java platforms
throw new RuntimeException("SHA-256 algorithm not available", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe IllegalStateException is better?

merged.getProperties().remove("reindex");
if (!deleteCopiesOutOfTheBoxIndex && indexDiff.toString().equals("{}")) {
merged.getProperties().put("type", "\"disabled\"");
merged.getProperties().put("mergeComment", "\"This index is superseeded and can be removed\"");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
merged.getProperties().put("mergeComment", "\"This index is superseeded and can be removed\"");
merged.getProperties().put("mergeComment", "\"This index is superseded and can be removed\"");

}
}
for (String r : toRemove) {
LOG.info("Removing old index " + r);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOG.info("Removing old index " + r);
LOG.info("Removing old index {}", r);

* (input)
* @return whether a new version of an index was added
*/
public boolean processMerge(String indexName, JsonObject indexDiff, JsonObject newImageLuceneDefinitions, JsonObject combined) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes recommend breaking this method, example (from comments)

  • findLatestVersions
  • performMerge
  • isChanged
  • addMetadata

}
LOG.info("Processing a new diff.index with node store {}", store);
JsonObject repositoryDefinitions = RootIndexesListService.getRootIndexDefinitions(indexDefinitions);
LOG.debug("Index list {}", repositoryDefinitions.toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
LOG.debug("Index list {}", repositoryDefinitions.toString());
LOG.debug("Index list {}", repositoryDefinitions);

*/
public static void addOrReplace(NodeBuilder builder, NodeStore nodeStore, String targetPath, String nodeType, String jsonString) throws CommitFailedException, IOException {
LOG.info("Storing {}: {}", targetPath, jsonString);
if (nodeType.indexOf("/") >= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (nodeType.indexOf("/") >= 0) {
if (nodeType.contains("/")) {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants