Glossary

adjacent state

If state A is connected via a single delta to state B, then A and B are adjacent to each other.

announce

After the blobs for a state have been published to a blob store by a producer, the state must be announced to consumers. The announcement signals to consumers that they should transition to the announced state.

blob

A blob is a file used by consumers to update their dataset. A blob will be either a snapshot, a delta, or a reverse delta.

blob store

A blob store is a file store to which blobs can be published by a producer and retrieved by a consumer.

broken delta chain

When a blob namespace contains a state which is not adjacent to any prior states, the delta chain is said to be broken. In this scenario, consumers may need to load a double snapshot.

consumer

One of many machines on which a dataset is made accessible. Consumers are updated in lock-step based on the actions of the producer.

cycle

A producer runs in an infinite loop. Each execution of the loop is called a cycle. Each cycle produces a single data state.

data model

A data model defines the structure of a dataset. It is specified with a set of schemas.

data state

A dataset changes over time. The timeline for a changing dataset can be broken down into discrete data states, each of which is a complete snapshot of the data at a particular point in time.

deduplication

Two records which have identical data in Hollow will be consolidated into a single record. Any references to duplicate records will be mapped to the canonical one when a dataset is represented with Hollow.
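
As a rough illustration of the idea, the sketch below maps each distinct combination of field values to one canonical ordinal, so adding an identical record a second time returns the existing ordinal. The class DedupSketch and its methods are hypothetical names invented for this example; this is not Hollow's implementation or API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of deduplication; not part of the Hollow library.
final class DedupSketch {
    // Maps a record's field values to the ordinal of its canonical copy.
    private final Map<List<Object>, Integer> ordinalByContent = new HashMap<>();

    // Returns the ordinal of the canonical record with these field values,
    // assigning a new ordinal only when no identical record exists yet.
    int add(Object... fieldValues) {
        List<Object> key = List.of(fieldValues);
        Integer existing = ordinalByContent.get(key);
        if (existing != null) {
            return existing;
        }
        int ordinal = ordinalByContent.size();
        ordinalByContent.put(key, ordinal);
        return ordinal;
    }

    int recordCount() {
        return ordinalByContent.size();
    }

    public static void main(String[] args) {
        DedupSketch movies = new DedupSketch();
        int a = movies.add("The Matrix", 1999);
        int b = movies.add("Beasts of No Nation", 2015);
        int c = movies.add("The Matrix", 1999); // identical data, same ordinal as a
        System.out.println(a + " " + b + " " + c + " count=" + movies.recordCount());
        // prints "0 1 0 count=2"
    }
}
```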

delta

A set of encoded instructions to transition from one data state to an adjacent state. Deltas are encoded as a set of ordinals to remove and a set of ordinals to add, along with the accompanying data to add. 'Delta' may refer specifically to a transition between an earlier state and a later state, contrasted with 'reverse delta', which specifically refers to a transition between a later state and an earlier state.
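
The remove-and-add shape of a delta can be sketched as follows. This is a simplified model that assumes a state is a map from ordinal to record data; DeltaSketch, Delta, and apply are hypothetical names for illustration and do not reflect Hollow's binary encoding or API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of applying a delta; not part of the Hollow library.
final class DeltaSketch {
    // A delta modeled as ordinals to remove plus ordinals to add with their data.
    static final class Delta {
        final Set<Integer> removed;
        final Map<Integer, String> added;

        Delta(Set<Integer> removed, Map<Integer, String> added) {
            this.removed = removed;
            this.added = added;
        }
    }

    // Produces the adjacent state without modifying the input state.
    static Map<Integer, String> apply(Map<Integer, String> state, Delta delta) {
        Map<Integer, String> next = new TreeMap<>(state);
        next.keySet().removeAll(delta.removed);
        next.putAll(delta.added);
        return next;
    }

    public static void main(String[] args) {
        Map<Integer, String> v1 = Map.of(0, "A", 1, "B", 2, "C");
        Delta forward = new Delta(Set.of(1), Map.of(3, "D")); // earlier -> later
        Delta reverse = new Delta(Set.of(3), Map.of(1, "B")); // later -> earlier

        Map<Integer, String> v2 = apply(v1, forward);
        System.out.println(v2);                                // {0=A, 2=C, 3=D}
        System.out.println(apply(v2, reverse).equals(v1));     // true
    }
}
```

In this model the reverse delta simply swaps the roles of the removed and added ordinals, which is why applying it to the later state restores the earlier one.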

delta chain

A series of states which are all connected via contiguous deltas.

diff

A comprehensive accounting for the differences between two data states.

double snapshot

When a consumer already has an initialized state and an announcement signals to move to a new state for which a path of deltas is not available, the consumer may transition to that state via a snapshot. In this scenario two full copies of the dataset must be loaded in memory.
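
The decision between following deltas and falling back to a snapshot might be sketched like this. The example assumes state versions increase over time and considers forward deltas only (reverse deltas are ignored for brevity); TransitionSketch and plan are hypothetical names, not Hollow's consumer logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of planning a consumer transition; not Hollow's API.
final class TransitionSketch {
    // deltas maps a state version to the adjacent later version reachable by one delta.
    static List<String> plan(long current, long announced, Map<Long, Long> deltas) {
        List<String> steps = new ArrayList<>();
        long version = current;
        while (version != announced) {
            Long next = deltas.get(version);
            // No usable delta from here: the chain is broken for this consumer,
            // so it would have to load a snapshot of the announced state.
            if (next == null || next <= version || next > announced) {
                return List.of("snapshot " + announced);
            }
            steps.add("delta " + version + " -> " + next);
            version = next;
        }
        return steps;
    }

    public static void main(String[] args) {
        // Contiguous chain 1 -> 2 -> 3: the consumer follows deltas.
        System.out.println(plan(1, 3, Map.of(1L, 2L, 2L, 3L)));
        // prints "[delta 1 -> 2, delta 2 -> 3]"

        // No delta leads out of state 2: the consumer falls back to a snapshot.
        System.out.println(plan(1, 5, Map.of(1L, 2L, 4L, 5L)));
        // prints "[snapshot 5]"
    }
}
```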

field

A single value encoded inside of a Hollow record.

hash key

A user-defined specification of one or more fields used to hash elements into a set or entries into a map.

ingestion

Gathering data from a source of truth and importing it into Hollow.

inline

A field for which the value is encoded directly into a record, as opposed to referenced via another record.

namespace (blobs)

An addressable, logical separation of both published artifacts in a blob store and announcement location. Used to allow multiple publishers to communicate on separate channels to specific groups of consumers.

namespace (references)

The deliberate creation of a type to hold a specific referenced field's data in order to reduce the cardinality of the referenced records.

object longevity

A technique used to ensure that stale references to Hollow Objects always return the same data they did initially upon creation. Configured via the HollowObjectMemoryConfig.

ordinal

An integer value uniquely identifying a record within a type. Because records are represented with a fixed-length number of bits, the only necessary information to locate a record in memory is the record's type and ordinal. Ordinals are automatically assigned by Hollow, and are recycled as records are removed and added. Consequently, they lie in the range of 0-n, where n is generally not much larger than the total number of records for the type.
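
A minimal sketch of both properties follows: ordinals freed by removed records are handed out again before new ones are allocated, and a fixed record width lets a record's position be computed from its ordinal alone. OrdinalSketch and its methods are hypothetical, and the exact recycling policy shown here (lowest freed ordinal first, reusable immediately) is an assumption made for the example rather than a description of Hollow's internals.

```java
import java.util.TreeSet;

// Hypothetical sketch of ordinal assignment and lookup; not part of the Hollow library.
final class OrdinalSketch {
    private final TreeSet<Integer> freed = new TreeSet<>();
    private int nextUnused = 0;

    // Reuses the lowest freed ordinal if one exists, otherwise allocates a new one.
    int assign() {
        Integer recycled = freed.pollFirst();
        return recycled != null ? recycled : nextUnused++;
    }

    // Marks the ordinal of a removed record as available for reuse.
    void release(int ordinal) {
        freed.add(ordinal);
    }

    // With fixed-width records, the ordinal alone determines where a record starts.
    static long bitOffset(int ordinal, int bitsPerRecord) {
        return (long) ordinal * bitsPerRecord;
    }

    public static void main(String[] args) {
        OrdinalSketch type = new OrdinalSketch();
        type.assign();                       // 0
        type.assign();                       // 1
        type.assign();                       // 2
        type.release(1);                     // record at ordinal 1 is removed
        System.out.println(type.assign());   // 1 (recycled)
        System.out.println(type.assign());   // 3 (newly allocated)
        System.out.println(bitOffset(3, 37)); // 111
    }
}
```

Because freed ordinals are reused first, the highest ordinal in use stays close to the number of live records, which matches the 0-n range described above.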

patch (states)

Creating a series of two deltas between states in a delta chain.

pinning

Overriding the state version announcement from the producer, to force clients to go back to or stay at an older state.

primary key

A user-defined specification of one or more fields used to uniquely identify a record within a type.

producer

A single machine that retrieves all data from a source of truth and produces a delta chain.

publish

Writing blobs to a blob store.

read state engine

A HollowReadStateEngine, the root handle to a Hollow dataset as a consumer.

record

A strongly-typed collection of fields or references, the structure of which is specified by a schema.

reference

A field type which indicates a pointer to another record. Can also refer to the technique of pulling out a specific field into a record type of its own to deliberately allow Hollow to deduplicate the values.

restore

Initializing a HollowWriteStateEngine with data from a previously produced state so that a delta may be created during a producer's first cycle.

reverse delta

A delta from a later state to an earlier state. Generally used during pinning scenarios.

schema

Metadata about a Hollow type which defines the structure of the records.

snapshot

A blob type which contains a serialization of all of the records in a dataset. Consumed during initialization, and possibly in a broken delta chain scenario.

state

See data state.

state version

A unique identifier for a state. Should be monotonically increasing as time passes.

state engine

Both the producer and consumers handle datasets with a state engine. A state engine can be transitioned between data states. A producer uses a write state engine and a consumer uses a read state engine.

type

A collection of records all conforming to a specific schema.

write state engine

A HollowWriteStateEngine, the root handle to a Hollow dataset as a producer.