In the Quick Start guide, we got a reference implementation of Hollow up and running, with a mock data model that can be easily modified to suit any use case. After reading this section, you'll have an understanding of the basic usage patterns for Hollow, and how the core pieces fit together.

Core Concepts

Hollow manages datasets which are built by a single producer, and disseminated to one or many consumers for read-only access. A dataset changes over time. The timeline for a changing dataset can be broken down into discrete data states, each of which is a complete snapshot of the data at a particular point in time.
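As a toy illustration of that timeline (the DataState type and version numbers here are invented for the example, not Hollow classes), each state is a complete, standalone snapshot:

```java
import java.util.List;
import java.util.Set;

public class TimelineSketch {
    // Illustrative only: each state is a complete snapshot of the dataset.
    record DataState(long version, Set<Long> movieIds) {}

    public static void main(String[] args) {
        List<DataState> timeline = List.of(
                new DataState(1, Set.of(1L, 2L, 3L)),       // initial state
                new DataState(2, Set.of(1L, 2L, 4L, 5L)));  // later state

        // Each state stands alone; no state depends on replaying history.
        System.out.println("states=" + timeline.size()
                + " latest=" + timeline.get(timeline.size() - 1).version());
    }
}
```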

Both the producer and consumers handle datasets with a state engine. A state engine can be transitioned between data states. A producer uses a write state engine and a consumer uses a read state engine.

Producing a Data Snapshot

Let's assume we have a POJO class Movie:

public class Movie {
    long id;
    String title;
    int releaseYear;

    public Movie(long id, String title, int releaseYear) {
        this.id = id;
        this.title = title;
        this.releaseYear = releaseYear;
    }
}

And that many Movies exist which comprise a dataset that needs to be disseminated:

List<Movie> movies = Arrays.asList(
        new Movie(1, "The Matrix", 1999),
        new Movie(2, "Beasts of No Nation", 2015),
        new Movie(3, "Pulp Fiction", 1994)
);

We'll need a data producer to create a data state which will be transmitted to consumers:

HollowWriteStateEngine writeEngine = new HollowWriteStateEngine();
HollowObjectMapper mapper = new HollowObjectMapper(writeEngine);

for(Movie movie : movies)
    mapper.addObject(movie);

OutputStream os = ...; /// where to write the blob
HollowBlobWriter writer = new HollowBlobWriter(writeEngine);
writer.writeSnapshot(os);

A HollowWriteStateEngine is the main handle to a Hollow dataset for a data producer. A HollowObjectMapper is one of a few different ways to populate a HollowWriteStateEngine with data. When starting with POJOs, it's the easiest way.

We'll use a HollowBlobWriter to write the current state of a HollowWriteStateEngine to an OutputStream. We call the data which gets written to the OutputStream a blob.

Publishing Blobs

For the purposes of testing, this blob can be written to local disk. In a production scenario, it can be written to a remote file store such as Amazon S3 for retrieval by consumers.
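A minimal sketch of that flow for local testing (the payload bytes and temp-file path are stand-ins; a real producer would stream writer.writeSnapshot(os) into the file, and a real deployment would upload to a remote store):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalBlobStoreSketch {
    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes writer.writeSnapshot(os) would produce.
        byte[] snapshotBlob = "snapshot-bytes".getBytes();

        // "Publish": write the blob to local disk for testing.
        Path blobPath = Files.createTempFile("movie-snapshot", ".blob");
        Files.write(blobPath, snapshotBlob);

        // "Retrieve": a consumer later opens this file as the InputStream
        // it hands to reader.readSnapshot(is).
        byte[] retrieved = Files.readAllBytes(blobPath);
        System.out.println(new String(retrieved));
    }
}
```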

Consumer API Generation

Once the data has been populated into a state engine, that state engine is aware of the data model, and can be used to automatically produce a client API:

HollowAPIGenerator generator = 
       new HollowAPIGenerator(
           "MovieAPI",                    /// A name for the API
           "com.netflix.hollow.example",  /// A package where the API will live
           writeEngine                    /// our state engine
       );

generator.generateFiles("/path/to/files");

After this code executes, a set of Java files will be written to the location /path/to/files. These Java files will be a generated API based on the data model defined by the schemas in our state engine, and will provide convenient methods to access that data.

Consuming a Data Snapshot

A data consumer can load a snapshot created by the producer into memory:

HollowReadStateEngine readEngine = new HollowReadStateEngine();
HollowBlobReader reader = new HollowBlobReader(readEngine);

InputStream is = ...; /// where to load the snapshot from
reader.readSnapshot(is);

A HollowReadStateEngine is our main handle to a Hollow dataset as a consumer. A HollowBlobReader is used to consume blobs into a HollowReadStateEngine. Above, we're consuming a snapshot blob in order to initialize our state engine.

Once this dataset is loaded into memory, we can access the data for any records using our generated API:

MovieAPI movieApi = new MovieAPI(readEngine);

for(MovieHollow movie : movieApi.getAllMovieHollow()) {
    System.out.println(movie._getId() + ", " + 
                       movie._getTitle()._getValue() + ", " + 
                       movie._getReleaseYear());
}

The output of the above code will be:

1, The Matrix, 1999
2, Beasts of No Nation, 2015
3, Pulp Fiction, 1994

Producing a Delta

Some time has passed and the dataset has evolved. It now contains these records:

List<Movie> movies = Arrays.asList(
        new Movie(1, "The Matrix", 1999),
        new Movie(2, "Beasts of No Nation", 2015),
        new Movie(4, "Goodfellas", 1990),
        new Movie(5, "Inception", 2010)
);

The producer, with the same HollowWriteStateEngine in memory, needs to communicate this updated dataset to consumers. The data for the new state must be added to the state engine, after which a transition from the previous state to the new state can be written as a delta blob:

writeEngine.prepareForNextCycle();

for(Movie movie : movies)
    mapper.addObject(movie);

OutputStream os = ...; /// where to write the delta blob
writer.writeDelta(os);

Let's take a closer look at what the above code does. The same HollowWriteStateEngine which was used to produce the snapshot blob is used -- it already knows everything about the prior state and can be transitioned to the next state. We call prepareForNextCycle() to inform the state engine that the writing of blobs from the prior state is complete, and populating data into the next state is about to begin. When creating a new state, all of the movies currently in our dataset are re-added again. It's not necessary to figure out which records were added, removed, or modified -- that's Hollow's job.

We can (but don't have to) use the same HollowObjectMapper and/or HollowBlobWriter as we used in the prior cycle to create the initial snapshot.

The call to writeDelta() records a delta blob to the OutputStream. Encoded into the delta is a set of instructions to update a consumer’s read state engine from the previous state to the current state.
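Conceptually (this is an illustration, not Hollow's wire format), the delta carries only the differences between the two states rather than the full dataset:

```java
import java.util.Set;
import java.util.TreeSet;

public class DeltaDiffSketch {
    public static void main(String[] args) {
        // Movie ids in the previous and current states.
        Set<Long> prev = Set.of(1L, 2L, 3L);
        Set<Long> next = Set.of(1L, 2L, 4L, 5L);

        Set<Long> removed = new TreeSet<>(prev);
        removed.removeAll(next);
        Set<Long> added = new TreeSet<>(next);
        added.removeAll(prev);

        // Only these differences travel in the delta.
        System.out.println("removed=" + removed + " added=" + added);
    }
}
```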

Producer Cycles

We call what the producer does to create a data state a cycle. During each cycle, we

  1. add all of the records from our dataset into the state engine, then
  2. write blobs (usually a snapshot, a delta, and a reverse delta) to our blob file store.
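As a toy model of why a reverse delta is published alongside the delta (sets of ids stand in for records; this is not Hollow's format), applying the reverse delta returns a consumer to the prior state:

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ReverseDeltaSketch {
    public static void main(String[] args) {
        Set<Long> state = new TreeSet<>(Set.of(1L, 2L, 3L));
        List<Long> removes = List.of(3L);        // instructions in the delta
        List<Long> adds = List.of(4L, 5L);

        state.removeAll(removes);                // apply the delta
        state.addAll(adds);
        System.out.println("after delta: " + state);

        state.removeAll(adds);                   // the reverse delta undoes it
        state.addAll(removes);
        System.out.println("after reverse delta: " + state);
    }
}
```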

Consuming a Delta

Once a delta is announced the HollowReadStateEngine can be updated on the client:

InputStream is = ...; /// where to load the delta from
HollowBlobReader reader = new HollowBlobReader(readEngine);
reader.applyDelta(is);

The same HollowReadStateEngine into which our snapshot was consumed must be reused to consume a delta blob. This state engine knows everything about the current state and can use the instructions in a delta to transition to the next state. We can (but don't have to) reuse the same HollowBlobReader.

After this delta has been applied, the read state engine is at the new state. If the generated API is used to iterate over the movies again as shown in the prior consumer example, the new output will be:

1, The Matrix, 1999
2, Beasts of No Nation, 2015
4, Goodfellas, 1990
5, Inception, 2010

Thread Safety

It is safe to use the HollowReadStateEngine to retrieve data while a delta transition is in progress.

Adjacent States

We refer to states which are directly connected via single delta transitions as adjacent states, and a continuous set of adjacent states as a delta chain.

Delta Mismatch

If a delta application is attempted onto a HollowReadStateEngine which is at a state from which the delta did not originate, then an exception is thrown and the state engine remains safely unchanged.
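A sketch of that guard (the version fields here are illustrative; Hollow tracks origin states internally rather than exposing them like this):

```java
public class DeltaMismatchSketch {
    public static void main(String[] args) {
        long currentStateVersion = 2L;
        long deltaOriginVersion = 1L;  // the state this delta applies to

        try {
            if (deltaOriginVersion != currentStateVersion)
                throw new IllegalStateException(
                        "delta originates from state " + deltaOriginVersion
                        + " but engine is at state " + currentStateVersion);
            System.out.println("delta applied");
        } catch (IllegalStateException e) {
            // The state engine remains unchanged.
            System.out.println("delta rejected: " + e.getMessage());
        }
    }
}
```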

Indexing Data for Retrieval

In prior examples the generated Hollow API was used by the data consumer to iterate over all Movie records in the dataset. Most often, however, it isn’t desirable to iterate over the entire dataset — instead, specific records will be accessed based on some known key. Let’s assume that the Movie’s id is a known key.

After consumers have populated a HollowReadStateEngine, the data can be indexed:

HollowPrimaryKeyIndex idx = 
                      new HollowPrimaryKeyIndex(readEngine, "Movie", "id");

idx.listenForDeltaUpdates();

This index can be held in memory and then used in conjunction with the generated Hollow API to retrieve Movie records by id:

int movieOrdinal = idx.getMatchingOrdinal(2);
if(movieOrdinal != -1) {
    MovieHollow movie = movieApi.getMovieHollow(movieOrdinal);
    System.out.println("Found Movie: " + movie._getTitle()._getValue());
}

Which outputs:

Found Movie: Beasts of No Nation

Keeping an Index Up To Date

The call to listenForDeltaUpdates() will cause a HollowPrimaryKeyIndex to automatically stay updated when deltas are applied to the indexed HollowReadStateEngine, but this should only be called if you intend to keep the index around. See the Indexing / Querying section for usage details.

Thread Safety

Retrievals from a HollowPrimaryKeyIndex are thread-safe. It is safe to use a HollowPrimaryKeyIndex from multiple threads, and it is safe to query while a transition is in progress.

Ordinals

Each record is assigned to a specific ordinal, which is an integer value. An ordinal:

  • is a unique identifier of the record within a type.
  • is sufficient to locate the record within a type.

Ordinals are automatically assigned by Hollow. They lie in the range of 0-n, where n is generally not much larger than the total number of records for the type.
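Because ordinals are small, dense integers, they can address records the way an array index does (this storage layout is illustrative, not Hollow's internal representation):

```java
public class OrdinalSketch {
    public static void main(String[] args) {
        // Per-type storage addressed by ordinal.
        String[] movieTitles = { "The Matrix", "Beasts of No Nation", "Pulp Fiction" };

        int ordinal = 1;  // e.g. a value returned by idx.getMatchingOrdinal(2)
        System.out.println(movieTitles[ordinal]);
    }
}
```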

Hierarchical Data Models

Our data models can be much richer than in the prior example. Assume an updated Movie class:

public class Movie {
    long id;
    String title;
    int releaseYear;
    List<Actor> actors;

    public Movie(long id, String title, int year, List<Actor> actors) {
        this.id = id;
        this.title = title;
        this.releaseYear = year;
        this.actors = actors;
    }
}

Which references Actor records:

public class Actor {
    long actorId;
    String actorName;

    public Actor(long actorId, String actorName) {
        this.actorId = actorId;
        this.actorName = actorName;
    }
}

Some records are added to a HollowWriteStateEngine on the producer:

List<Movie> movies = Arrays.asList(
        new Movie(1, "The Matrix", 1999, Arrays.asList(
                new Actor(101, "Keanu Reeves"),
                new Actor(102, "Laurence Fishburne"),
                new Actor(103, "Carrie-Anne Moss"),
                new Actor(104, "Hugo Weaving")
        )),
        new Movie(6, "Event Horizon", 1997, Arrays.asList(
                new Actor(102, "Laurence Fishburne"),
                new Actor(105, "Sam Neill")
        ))
);


for(Movie movie : movies)
    mapper.addObject(movie);

When we add these movies to the dataset, the HollowObjectMapper will traverse everything referenced by the provided records and add them to the state as well. Consequently, both a type Movie and a type Actor will exist in the data model after the above code runs.
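A toy model of that traversal (the record types and lists here are illustrative; the real HollowObjectMapper discovers references by inspecting the POJOs):

```java
import java.util.ArrayList;
import java.util.List;

public class TraversalSketch {
    record Actor(long actorId, String actorName) {}
    record Movie(long id, String title, List<Actor> actors) {}

    public static void main(String[] args) {
        List<Movie> movieRecords = new ArrayList<>();
        List<Actor> actorRecords = new ArrayList<>();

        Movie movie = new Movie(1, "The Matrix", List.of(
                new Actor(101, "Keanu Reeves"),
                new Actor(102, "Laurence Fishburne")));

        // Adding a Movie also registers every Actor it references,
        // so both types end up in the data model.
        movieRecords.add(movie);
        actorRecords.addAll(movie.actors());

        System.out.println(movieRecords.size() + " Movie record(s), "
                + actorRecords.size() + " Actor record(s)");
    }
}
```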

Deduplication

Laurence Fishburne starred in both of these films. Rather than creating two Actor records for Mr. Fishburne, a single record will be created and assigned to both of our Movie records. This deduplication happens automatically by virtue of having the exact same data contained in both Actor inputs.
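The effect can be modeled with value equality (a sketch, not Hollow's mechanism; Java records compare by field values, which stands in for Hollow comparing record content):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {
    record Actor(long actorId, String actorName) {}

    public static void main(String[] args) {
        List<Actor> inputs = List.of(
                new Actor(102, "Laurence Fishburne"),   // from The Matrix
                new Actor(105, "Sam Neill"),
                new Actor(102, "Laurence Fishburne"));  // from Event Horizon

        // Identical field values collapse to a single stored record.
        Map<Actor, Integer> ordinalByValue = new LinkedHashMap<>();
        for (Actor a : inputs)
            ordinalByValue.putIfAbsent(a, ordinalByValue.size());

        System.out.println(ordinalByValue.size() + " unique Actor record(s)");
    }
}
```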

Consumers of this dataset may want to also create an index for Actor records:

HollowPrimaryKeyIndex actorIdx = 
                    new HollowPrimaryKeyIndex(readEngine, "Actor", "actorId");
actorIdx.listenForDeltaUpdates();

This index can be used in the same way as the Movie index to retrieve Actor records:

int actorOrdinal = actorIdx.getMatchingOrdinal(102);
if(actorOrdinal != -1) {
    ActorHollow actor = movieApi.getActorHollow(actorOrdinal);
    System.out.println("Found Actor: " + actor._getActorName()._getValue());
}

Which outputs:

Found Actor: Laurence Fishburne