Create Pipelines using SQL

If you want Decodable to transform your data before sending it to a destination, you must create a pipeline and apply it to one or more Decodable streams. A pipeline is a set of data processing instructions written in SQL or expressed as an Apache Flink job. See Create Pipelines using your own Apache FlinkĀ® Jobs. When you apply a pipeline to a stream, Decodable uses those instructions to process and transform the data that the pipeline receives.

Steps

Complete the following steps to create a pipeline that processes data from a Decodable stream, transforms it, and sends that data to another Decodable stream.

  1. Do one of the following to open the Pipeline Editor page.

    • On the Pipelines page, select New Pipeline and then select the stream that you’d like the pipeline to process data from.

    • On the Streams page, select the stream that contains the data that you want your pipeline to process. Select …​ next to Output/Outbound, and then select Create New Pipeline.

  2. To transform incoming data before sending it to a destination, modify the existing SQL statement. For more information and examples of the common SQL patterns used to transform data in a pipeline, see the following topics:

    • For the full list of functions that you can include in a Decodable pipeline, see Function Reference.

    • For examples of the most common pipeline patterns, such as how to write SQL that filters or masks data, or how to join two or more streams together to produce one enriched stream, or how to convert an append stream to a change stream or vice-versa, see Pipeline Examples.

    • To learn about how to perform summarizations on streaming data, see Windowing Reference.

  3. (Optional) To generate a preview of how your pipeline processes and transforms data, select the Run Preview button. Use the preview results to validate your SQL statement.

    • A preview session continues until 2 minutes have passed or 30 records have been sampled, whichever comes first.

    • You can specify the location in the data stream where the pipeline starts reading data by updating the Starting Position. If set to Latest, then the pipeline samples records starting from the latest position in the data stream. If set to Earliest, then the pipeline samples records starting from the very beginning of the data stream. Defaults to Latest.

  4. The destination that your pipeline sends data to is defined in the INSERT INTO statement. If a stream by that name doesn’t exist yet, Decodable will create the output stream for you.

  5. To save your pipeline, select Next and give your pipeline a name and description.

You can now start the pipeline to begin processing and transforming data as specified in the pipeline configuration.