Redshift vs. Snowflake

Redshift vs Snowflake

 

In the following table, I shall highlight key features of the two data warehousing options Redshift versus Snowflake. In the case of Redshift, keep in mind that there are various node types and sizes and features will differ across them.

Redshift

Snowflake

Cloud Support

If you’re an AWS workshop, Redshift is going to integrate better in your ecosystem. For example, assuming the team has a Postgres database hosted in AWS that acts as the operations DB for your application, you can manage the transfer of data from the PostgresDB into Redshift within AWS using tools such as AWS’s Database Migration Service or Glue. It would mean that an external 3rd party ETL tool would not be needed if the team wants the application data to stay within the AWS account.

Snowflake offers multi-cloud support and while it does not seamlessly integrate in certain cloud ecosystems such as BigQuery with GCP and Redshift with AWS, it still integrates well with 3rd party SaaS tools and also has a Marketplace that offers datasets and data products that integrate directly with the Snowflake platform.

Deployment and Maintenance

A redshift cluster can be spun up in 5 minutes. The team also already has one provisioned.


Redshift generally has required more periodic maintenance than other more fully managed data warehouses such as Snowflake and BigQuery. There have been improvements in behind the scenes maintenance such as the introduction of automatic table optimization, however, one may still find themselves needing to manage the correct sort and distribution keys for tables and vaccuming to ensure continued optimal performance.

Signing up for an account takes 10 minutes and the initial provisioning of the system (adding key roles, warehouses, grants etc) takes about 2 hours. The process would be to signup, create an account, and invite me to your account to continue the setup.


Snowflake is a fully managed platform that requires little to no maintenance. Most of the maintenance is spent optimizing queries which has to be done on every warehouse.

Data Features / Limitations


  • VARCHAR (text) fields have a limit of 

65,535 bytes. You may run into issues storing very large texts.

  • Offers some semi-structured data support with the Super data type
  • There are limitations with cross database queries, e.g. they’re not supported on dc2 node types. Cross-database queries allow users to join data across different databases within the same cluster.
  • Better handling of very large datasets (petabytes)

  • VARCHAR (text) fields have a limit of 16,777,216 bytes.
  • Offers semi-structured data support with VARIANT, OBJECT and ARRAY data types.
  • Cross-database queries fully supported

Scalability

Clusters can be scaled up and down, resized, nodes added etc, however depending on the task, downtime for maintenance may be required.

Warehouses can be provisioned with almost no downtime. There is also flexibility on how you can scale, i.e. you can have certain tasks running on larger warehouses, and other tasks on smaller warehouses and scale up and down as needed.

Pricing

Redshift offers a variety of options. It offers node configurations that combine storage and compute (d2c) or separate storage and compute (ra3, Serverless)


Pricing is going to vary depending on the node type, size and quantity.


Sample pricing assuming 100% uptime:


https://aws.amazon.com/blogs/apn/amazon-redshift-benchmarking-comparison-of-ra3-vs-ds2-instance-types/


dc2.large (1 node) – $0.25 per hour

ra3.4xlarge (2 nodes) – $6.67 per hour

Snowflake separates storage and compute. Consequently, if there are no queries being run against the warehouse, you’re really only paying for storage. 1TB storage in AWS US East is $23 per month


https://www.snowflake.com/en/data-cloud/pricing-options/


https://docs.snowflake.com/en/user-guide/cost-understanding-compute


https://docs.snowflake.com/en/user-guide/warehouses-overview 


Computing costs assuming 100% uptime:


Standard – X-Small – $2 per hour

Standard – X-Large – $32 per hour

Business Critical – X-Large – $64 per hour