A Data Model

For the purposes of these examples, let's imagine we have a data model defined by the following Objects:

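import com.netflix.hollow.core.write.objectmapper.HollowHashKey;
import com.netflix.hollow.core.write.objectmapper.HollowPrimaryKey;
import java.util.Set;

@HollowPrimaryKey(fields="id")   // default primary key for Movie
public class Movie {
    int id;
    String title;

    @HollowHashKey(fields="actor.actorId")   // each cast set is hashed by actor ID
    Set<ActorRole> cast;
}

public class ActorRole {
    Actor actor;
    int movieId;
    String characterName;
}

public class Actor {
    int actorId;
    String name;
}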

Primary Key Indexes

When we generate a client API, each type in our data model gets a custom index class called <typename>PrimaryKeyIndex. We can use these classes to look up records based on primary key values.

Default Primary Keys

Once we have loaded a dataset into a HollowConsumer, we can use the Movie index to retrieve data by the default primary key id:

HollowConsumer consumer = ...;

MoviePrimaryKeyIndex idx = new MoviePrimaryKeyIndex(consumer);

int knownMovieId = ...;

Movie movie = idx.findMatch(knownMovieId);

Just as the HollowConsumer automatically stays up to date as your dataset updates, a primary key index stays up to date with the HollowConsumer that backs it.

Share Indexes

Queries to indexes are thread-safe. We should create each of the indexes we need only once, and share them everywhere they are needed.

Consumer-specified Primary Keys

In the prior example, our primary key index was using the default primary key defined in the data model. A primary key index is not restricted to just default primary keys. For example, we could also index movies by their title:

MoviePrimaryKeyIndex idx = new MoviePrimaryKeyIndex(consumer, "title");

String knownMovieTitle = ...;

Movie movie = idx.findMatch(knownMovieTitle);

Primary Keys

A primary key index should be used when there is a one-to-one mapping between records and key values. A primary key index returns only one record per key; if multiple records exist for a given key, an arbitrary match is returned.
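To make the one-to-one expectation concrete, here is a minimal plain-Java sketch that does not use Hollow at all. It models a primary key index as a map holding one record per key. The names `OneRecordPerKeySketch`, `SimpleMovie`, and `buildIndexByTitle` are hypothetical, and the "last record wins" behavior is only how this sketch resolves duplicates; it says nothing about which record Hollow would return.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OneRecordPerKeySketch {

    // Hypothetical stand-in for a record; not a Hollow type.
    static final class SimpleMovie {
        final int id;
        final String title;

        SimpleMovie(int id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // Keeps exactly one record per title. If two records share a title,
    // the later one silently replaces the earlier one.
    static Map<String, SimpleMovie> buildIndexByTitle(List<SimpleMovie> movies) {
        Map<String, SimpleMovie> index = new HashMap<>();
        for (SimpleMovie movie : movies) {
            index.put(movie.title, movie);
        }
        return index;
    }

    public static void main(String[] args) {
        List<SimpleMovie> movies = Arrays.asList(
                new SimpleMovie(1, "First Voyage"),
                new SimpleMovie(2, "Second Voyage"),
                new SimpleMovie(3, "First Voyage")); // duplicate title

        Map<String, SimpleMovie> index = buildIndexByTitle(movies);

        // Only one of the two "First Voyage" records is reachable by key.
        System.out.println(index.size());                  // prints 2
        System.out.println(index.get("First Voyage").id);  // prints 3
    }
}
```

If a key can legitimately map to several records, a structure that keeps every match per key is the better fit.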

Compound Primary Keys

A primary key index may also be specified over multiple fields. For example, we can define a primary key index for the ActorRole type above:

ActorRolePrimaryKeyIndex idx = new ActorRolePrimaryKeyIndex(consumer, "actor.actorId", "movieId");

int knownActorId = ...;
int knownMovieId = ...;

ActorRole actorRoleInMovie = idx.findMatch(knownActorId, knownMovieId);

In the above example, we are looking for the actor role which matches both the actor ID and the movie ID. Note that the actor ID was specified with dot notation as actor.actorId. This is a field path, and indicates that the actual value we're indexing belongs to a referenced record. For a primary key index, we can only traverse through referenced Object records, not List, Set, or Map records. We'll cover more about field paths a bit further down.
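As a rough mental model for a compound key, the following plain-Java sketch indexes roles by a two-part key, with one part read through a reference to another record. It is not Hollow code: `CompoundKeySketch`, its nested `Actor` and `ActorRole` classes, and the `key` and `index` helpers are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CompoundKeySketch {

    // Hypothetical stand-ins for the Actor and ActorRole records; not Hollow types.
    static final class Actor {
        final int actorId;
        final String name;

        Actor(int actorId, String name) {
            this.actorId = actorId;
            this.name = name;
        }
    }

    static final class ActorRole {
        final Actor actor;
        final int movieId;
        final String characterName;

        ActorRole(Actor actor, int movieId, String characterName) {
            this.actor = actor;
            this.movieId = movieId;
            this.characterName = characterName;
        }
    }

    // Lists compare by value, so two keys built from the same
    // (actorId, movieId) pair are equal and hash identically.
    static List<Integer> key(int actorId, int movieId) {
        return Arrays.asList(actorId, movieId);
    }

    static Map<List<Integer>, ActorRole> index(List<ActorRole> roles) {
        Map<List<Integer>, ActorRole> byKey = new HashMap<>();
        for (ActorRole role : roles) {
            // The first key part lives on the referenced Actor record,
            // which is what a dotted field path expresses.
            byKey.put(key(role.actor.actorId, role.movieId), role);
        }
        return byKey;
    }

    public static void main(String[] args) {
        Actor ann = new Actor(7, "Ann Actor");
        Map<List<Integer>, ActorRole> byKey = index(Arrays.asList(
                new ActorRole(ann, 100, "Captain"),
                new ActorRole(ann, 200, "Pilot")));

        System.out.println(byKey.get(key(7, 200)).characterName); // prints Pilot
        System.out.println(byKey.get(key(7, 999)));               // prints null
    }
}
```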

Hash Indexes

If we want to find records based on keys for which there is not a one-to-one mapping between records and key values, we want a hash index. With our generated client API, we have a single class <API classname>HashIndex. We can use instances of this class to specify hash indexes. A hash index must specify a query type, a select field, and one or more match fields. If we want to select the same type we are using to search, we should specify our select field as an empty String "".
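The three ingredients (query type, select field, match fields) can be pictured with a plain-Java sketch that keeps every match per key. This is an analogy only and does not use Hollow; `MultiMatchIndexSketch`, `Film`, `Role`, and both index-building methods are hypothetical names. The first method selects the query type itself, and the second selects a nested value instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MultiMatchIndexSketch {

    // Hypothetical stand-ins for the data model; not Hollow types.
    static final class Role {
        final String actorName;
        final String characterName;

        Role(String actorName, String characterName) {
            this.actorName = actorName;
            this.characterName = characterName;
        }
    }

    static final class Film {
        final String title;
        final List<Role> cast;

        Film(String title, Role... cast) {
            this.title = title;
            this.cast = Arrays.asList(cast);
        }
    }

    // Query root: Film. Selected node: the Film itself.
    // Match value: a character name found in its cast.
    static Map<String, Set<Film>> filmsByCharacterName(List<Film> films) {
        Map<String, Set<Film>> index = new HashMap<>();
        for (Film film : films) {
            for (Role role : film.cast) {
                index.computeIfAbsent(role.characterName, k -> new LinkedHashSet<>()).add(film);
            }
        }
        return index;
    }

    // Query root: Film. Selected node: a nested value (the actor name).
    // Match value: the film title.
    static Map<String, List<String>> actorNamesByTitle(List<Film> films) {
        Map<String, List<String>> index = new HashMap<>();
        for (Film film : films) {
            for (Role role : film.cast) {
                index.computeIfAbsent(film.title, k -> new ArrayList<>()).add(role.actorName);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<Film> films = Arrays.asList(
                new Film("First Voyage", new Role("Ann Actor", "Captain"), new Role("Bob Player", "Pilot")),
                new Film("Second Voyage", new Role("Cy Star", "Captain")));

        // One key, several matching records.
        System.out.println(filmsByCharacterName(films).get("Captain").size()); // prints 2
        System.out.println(actorNamesByTitle(films).get("First Voyage"));      // prints [Ann Actor, Bob Player]
    }
}
```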

For example, if we want to match Movie records which had characters with some name, we can use the following:

MovieAPIHashIndex idx = new MovieAPIHashIndex(consumer, "Movie", "", "cast.element.characterName.value");

String knownCharacterName = ...;

for(Movie movie : idx.findMovieMatches(knownCharacterName)) {
    System.out.println("The movie " + movie.getTitle().getValue() + 
                       " has a character named " + knownCharacterName);
}

Above, we are selecting the same type from which our query is derived. If we instead want to find the Actor records which starred in Movie records with a specific title, we still formulate our query at the Movie level, but we select a different node:

MovieAPIHashIndex idx = new MovieAPIHashIndex(consumer, "Movie", "cast.element.actor", "title.value");

String knownMovieTitle = ...;

for(Actor actor : idx.findActorMatches(knownMovieTitle)) {
    System.out.println("The actor " + actor.getName().getValue() +
                       " starred in " + knownMovieTitle);
}

We can also match at multiple places in a type hierarchy. For example, if we want to find the ActorRole by actor id and movie title, we can use the following:

MovieAPIHashIndex idx = new MovieAPIHashIndex(consumer, "Movie", "cast.element", 
                                              "cast.element.actor.actorId", "title.value");

String knownMovieTitle = ...;
int knownActorId = ...;

for(ActorRole role : idx.findActorRoleMatches(knownActorId, knownMovieTitle)) {
    System.out.println("The actor " + role.getActor().getName().getValue() +
                       " starred in " + knownMovieTitle + 
                       " as " + role.getCharacterName().getValue());
}

Field Paths

A field path indicates how to traverse through a type hierarchy. It contains multiple parts delimited by ., and we need one part per type through which we're traversing. Each part corresponding to an OBJECT type should be equal to the name of a field in that type.

Primary key and hash key field paths may only span through OBJECT types. These field paths will be automatically expanded if they end in a REFERENCE field which points to a type that has only a single field, or a type which has a primary key with only a single field defined. If auto-expansion is not desired, the field path should terminate with a ! character. For example, in our data model example above, the following field paths for the type Movie are equivalent: title, title.value. If we actually want the field path to terminate at the REFERENCE field title, we can specify the field path as title!.

Hash index field paths may span through any type. Each part corresponding to a LIST or SET type should be specified as element. Similarly, each part corresponding to a MAP type should be specified as either key or value. Hash index field paths are never auto-expanded.
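The syntax described above can be illustrated with a small plain-Java parser. This is a sketch under stated assumptions, not Hollow's own field path handling: `FieldPathSketch`, `ParsedPath`, and `parse` are hypothetical names, and the sketch only splits the text. It does not resolve parts against schemas or perform auto-expansion.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class FieldPathSketch {

    // The dot-delimited parts of a path, plus whether the path ended
    // with '!' (meaning "do not auto-expand").
    static final class ParsedPath {
        final List<String> parts;
        final boolean autoExpandDisabled;

        ParsedPath(List<String> parts, boolean autoExpandDisabled) {
            this.parts = parts;
            this.autoExpandDisabled = autoExpandDisabled;
        }
    }

    static ParsedPath parse(String fieldPath) {
        boolean disabled = fieldPath.endsWith("!");
        String body = disabled
                ? fieldPath.substring(0, fieldPath.length() - 1)
                : fieldPath;

        // An empty path has no parts; it refers to the starting type itself.
        List<String> parts = body.isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(body.split("\\."));

        return new ParsedPath(parts, disabled);
    }

    public static void main(String[] args) {
        // One part per type traversed: Movie -> Set -> ActorRole -> Actor.
        System.out.println(parse("cast.element.actor.actorId").parts); // prints [cast, element, actor, actorId]

        ParsedPath terminated = parse("title!");
        System.out.println(terminated.parts);              // prints [title]
        System.out.println(terminated.autoExpandDisabled); // prints true

        System.out.println(parse("").parts);               // prints []
    }
}
```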

Hash Keys

Notice that in the POJOs of our data model defined at the beginning of this topic, we annotated the Set<ActorRole> in the Movie type with @HollowHashKey(fields="actor.actorId"). This means that for each of these sets, the data will be hashed by the actor ID in the contained record. In our generated API, we can easily find any record by actor ID using the findElement() method. For example:

Movie movie = ...;
int knownActorId = ...;

ActorRole role = movie.getCast().findElement(knownActorId);

In this way, each of our set records can be indexed by any field, or combination of fields, for O(1) retrieval of contained records. The rules for defining a hash key are similar to the rules for defining a primary key:

  • Compound hash keys may be defined by specifying multiple fields.
  • Field paths may only span through OBJECT types.
  • Field paths will be auto-expanded if they terminate in a REFERENCE field.
  • Hash keys should be used when there is a one-to-one mapping between records and keys per set. If duplicates exist, an arbitrary valid match will be returned.

If defined on a set type, hash key field paths should be defined starting from the element type.
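One way to picture a set that is hashed by a field of its elements is the following plain-Java sketch. It is not how Hollow stores sets internally; `KeyedSetSketch`, its `Role` class, and the key-extractor function are hypothetical, and only the method name `findElement` is borrowed from the generated API for readability.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class KeyedSetSketch<K, E> {

    private final Map<K, E> elementsByKey = new HashMap<>();

    // The key is derived from each element, starting at the element type.
    KeyedSetSketch(Iterable<E> elements, Function<E, K> hashKey) {
        for (E element : elements) {
            elementsByKey.put(hashKey.apply(element), element);
        }
    }

    // Expected O(1) lookup of a contained element by its key value.
    E findElement(K key) {
        return elementsByKey.get(key);
    }

    // Hypothetical stand-in for an element record; not a Hollow type.
    static final class Role {
        final int actorId;
        final String characterName;

        Role(int actorId, String characterName) {
            this.actorId = actorId;
            this.characterName = characterName;
        }
    }

    public static void main(String[] args) {
        List<Role> roles = Arrays.asList(new Role(7, "Captain"), new Role(8, "Pilot"));
        KeyedSetSketch<Integer, Role> cast =
                new KeyedSetSketch<Integer, Role>(roles, role -> role.actorId);

        System.out.println(cast.findElement(8).characterName); // prints Pilot
        System.out.println(cast.findElement(99));              // prints null
    }
}
```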

Hash keys may also be defined on map record types. When defined on a map record, the field paths should be defined starting from the key type. The methods findKey(), findValue(), and findElement() are available on map types in the generated API for consumers to look up records by hash key values.

If using the HollowObjectMapper, unspecified hash keys will be automatically selected if an element or key type contains a single non-reference field. Additionally, if a Set or Map references Object elements with a defined primary key, then the hash key will default to the primary key of the element type. Alternatively, hash keys can be explicitly defined using the @HollowHashKey annotation in POJOs for Set schemas by specifying one or more fields from the element type, or for Map schemas by specifying one or more fields from the key type. See our data model example at the beginning of this section for an example.

Field Match Scan Queries

Each of the examples above pre-indexes your dataset to achieve O(1) lookup times. These lookups are very efficient, but require knowing in advance what you will be searching for. Given that Hollow datasets exist entirely in memory, for some use cases it is reasonable to scan through the entire dataset looking for matches.

The HollowFieldMatchQuery can be used to accommodate these use cases. The Hollow Explorer UI, for example, uses this mechanism to provide a powerful "search-for-anything" capability with reasonable response times for low-volume query traffic.
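The trade-off between scanning and pre-indexing can be seen in a plain-Java sketch of a full scan. It does not use HollowFieldMatchQuery or any Hollow type: `FieldScanSketch` and `scan` are hypothetical, records are modeled as simple maps, and positions in the list stand in for record identifiers. Each query costs O(n), but nothing has to be built ahead of time.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldScanSketch {

    // Visits every record and marks the positions whose named field,
    // rendered as a String, equals the requested value.
    static BitSet scan(List<Map<String, Object>> records, String fieldName, String fieldValue) {
        BitSet matches = new BitSet(records.size());
        for (int i = 0; i < records.size(); i++) {
            Object value = records.get(i).get(fieldName);
            if (value != null && value.toString().equals(fieldValue)) {
                matches.set(i);
            }
        }
        return matches;
    }

    // Small helper to build a two-field record for the demo.
    static Map<String, Object> record(int id, String title) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("id", id);
        fields.put("title", title);
        return fields;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> records = new ArrayList<>();
        records.add(record(1, "First Voyage"));
        records.add(record(2, "Second Voyage"));
        records.add(record(3, "First Voyage"));

        System.out.println(scan(records, "title", "First Voyage")); // prints {0, 2}
        System.out.println(scan(records, "id", "2"));               // prints {1}
    }
}
```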

Diving Deeper

Lower-level interfaces are available to index data in the absence of a generated API. See Diving Deeper: Indexing Data for Retrieval for a detailed look.