Time-based lookup joins

A time-based lookup join is a type of join that allows you to join two data streams based on the temporal (time-based) relationship between their records. Specifically, it allows you to join records from one stream with records from another stream based on a time window or interval. This type of join is sometimes referred to as a temporal join.

In Decodable, time-based lookup joins can be performed based on either event time (historical lookup join) or processing time (latest value lookup join).

  • Historical lookup join: In historical lookup joins, the join is performed based on the timestamps in each record that represent when the record’s event took place. In other words, historical lookup joins are joins based on event time. Historical lookup joins enable you to lookup historical values, so you can match the timestamp of a record in one stream with the timestamp of another record in another stream.

  • Latest value lookup join: In latest value lookup joins, for each incoming record from an append stream, Decodable looks up the latest value from a change stream. This means that the join operation is performed based on the system time at which the records are processed, which can be different from the time at which the record’s event actually occurred.

Example

In this example, we’ll use a historical lookup join to join two streams of data. One stream, an append stream called taxi_fares, contains transaction events with varying currencies. The other stream, called exchange_rates, contains a temporal table from a change stream with exchange rates that fluctuate over time. You want to join these two streams together so you have one change stream that contains the transaction with the currency exchange rate in effect at the time of the transaction.

Taxi_Fares

Time Price Currency

9:50

2

USD

10:15

1

EUR

10:30

100

YEN

11:15

3

USD

11:50

5

EUR

Exchange_Rates

Time Currency Rate

9:00

YEN

1

9:15

EUR

141

9:30

USD

136

10:45

USD

140

11:30

EUR

143

You can join these two streams and output a single stream of converted fares.

Converted_Fares

Time Converted_fare

9:00

272

9:00

141

9:15

100

10:50

420

11:00

715

An example of what the pipeline looks like:

INSERT INTO Converted_Fares SELECT
    time,
    price,
    currency,
FROM taxi_fares t
JOIN exchange_rates r FOR SYSTEM_TIME AS OF t.time
ON t.currency = r.currency

The following diagram illustrates how each incoming transaction is matched with the correct exchange rate at the time of the transaction.

timebasedwindowjoinexample