Amazon Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This allows you to use your data to gain new insights for your business and customers.

The first step to create a data warehouse is to launch a set of nodes, called an Amazon Redshift cluster. After you provision your cluster, you can upload your data set and then perform data analysis queries. Regardless of the size of the data set, Amazon Redshift offers fast query performance using the same SQL-based tools and business intelligence applications that you use today.

Overview

Connector name: redshift

Type: sink

Getting Started

Sending a Decodable data stream to Redshift is accomplished in two stages: first, create a sink connector that writes the stream to Amazon Kinesis; then, use Redshift streaming ingestion to materialize a view from the Kinesis stream and merge its contents into a Redshift table.

Configure As A Sink

This example uses Kinesis as the sink from Decodable and the source for Redshift. Sign in to Decodable Web and follow the configuration steps provided for the AWS Kinesis connector to create a sink connection. For examples of using the command line tools or scripting, see the How To guides.

Configure Streaming Ingestion

Previously, loading data from a streaming service like Amazon Kinesis Data Streams into Amazon Redshift involved several steps: connecting the stream to Amazon Kinesis Data Firehose, waiting for Firehose to stage the data in Amazon S3 using various-sized batches at varying-length buffer intervals, and finally having Firehose trigger a COPY command to load the data from Amazon S3 into a table in Redshift.

Streaming ingestion removes the intermediate Amazon S3 staging step, providing low-latency, high-speed ingestion of stream data from Kinesis directly into Redshift. The steps are as follows; illustrative SQL sketches for steps 4 through 7 appear after the list.

  1. Create a Decodable Kinesis sink connection

  2. Create an IAM role

  3. Assign the IAM role to the Redshift cluster

  4. Define an external schema

  5. Create a materialized view

  6. Refresh the materialized view

  7. Merge the data
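
Steps 2 and 3 are standard AWS setup: create an IAM role whose trust policy allows the redshift.amazonaws.com service principal to assume it and whose permissions grant read access to the Kinesis stream (for example, the AmazonKinesisReadOnlyAccess managed policy), then attach the role to the cluster from the Redshift console or with the AWS CLI. For step 4, an external schema maps Kinesis into Redshift. A minimal sketch follows; the schema name and the role ARN are placeholders for your own values:

    -- Step 4: map the Kinesis stream into Redshift through an external schema.
    -- The IAM role ARN is a placeholder for the role created in steps 2 and 3.
    CREATE EXTERNAL SCHEMA kinesis_schema
    FROM KINESIS
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-streaming-role';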
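
For step 5, a materialized view over the external schema defines what Redshift reads from the stream. The sketch below makes some assumptions: the Decodable sink writes UTF-8 JSON records to a hypothetical stream named my-decodable-stream. The raw payload arrives in the VARBYTE column kinesis_data, so it is decoded and parsed into a SUPER column:

    -- Step 5: define what Redshift ingests from the stream.
    -- "my-decodable-stream" is a hypothetical name; use the stream your
    -- Decodable Kinesis sink connection writes to.
    CREATE MATERIALIZED VIEW orders_stream_view AS
    SELECT approximate_arrival_timestamp,
           partition_key,
           shard_id,
           sequence_number,
           -- Decode the raw bytes and parse them as JSON into a SUPER value.
           json_parse(from_varbyte(kinesis_data, 'utf-8')) AS payload
    FROM kinesis_schema."my-decodable-stream";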
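
Step 6 is what actually pulls records from the stream into the view; ingestion happens on refresh. Views created with AUTO REFRESH YES are refreshed by Redshift automatically; otherwise, refresh on demand or on a schedule:

    -- Step 6: ingest the latest stream records into the materialized view.
    REFRESH MATERIALIZED VIEW orders_stream_view;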
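
Step 7 moves the ingested records into a regular Redshift table. One common pattern, sketched below under the assumption of a hypothetical orders(order_id, status) target table and one record per key, is to stage the parsed fields and upsert them with MERGE; a DELETE-plus-INSERT transaction is the traditional alternative. If the stream can carry several updates per key, deduplicate in the staging step first, since MERGE rejects duplicate source matches.

    -- Step 7: upsert ingested records into the target table.
    -- orders(order_id, status) and the payload fields are hypothetical;
    -- adjust them to match your stream's schema.
    CREATE TEMP TABLE orders_staging AS
    SELECT payload.order_id::varchar AS order_id,
           payload.status::varchar   AS status
    FROM orders_stream_view;

    MERGE INTO orders
    USING orders_staging
    ON orders.order_id = orders_staging.order_id
    WHEN MATCHED THEN
        UPDATE SET status = orders_staging.status
    WHEN NOT MATCHED THEN
        INSERT (order_id, status)
        VALUES (orders_staging.order_id, orders_staging.status);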

For more detailed information, refer to the Redshift example in the Decodable GitHub repository or to Redshift's Streaming Ingestion documentation.