Redshift vs Snowflake
In the following table, I shall highlight key features of the two data warehousing options Redshift versus Snowflake. In the case of Redshift, keep in mind that there are various node types and sizes and features will differ across them.
Redshift | Snowflake |
Cloud Support | |
If you’re an AWS workshop, Redshift is going to integrate better in your ecosystem. For example, assuming the team has a Postgres database hosted in AWS that acts as the operations DB for your application, you can manage the transfer of data from the PostgresDB into Redshift within AWS using tools such as AWS’s Database Migration Service or Glue. It would mean that an external 3rd party ETL tool would not be needed if the team wants the application data to stay within the AWS account. | Snowflake offers multi-cloud support and while it does not seamlessly integrate in certain cloud ecosystems such as BigQuery with GCP and Redshift with AWS, it still integrates well with 3rd party SaaS tools and also has a Marketplace that offers datasets and data products that integrate directly with the Snowflake platform. |
Deployment and Maintenance | |
A redshift cluster can be spun up in 5 minutes. The team also already has one provisioned. Redshift generally has required more periodic maintenance than other more fully managed data warehouses such as Snowflake and BigQuery. There have been improvements in behind the scenes maintenance such as the introduction of automatic table optimization, however, one may still find themselves needing to manage the correct sort and distribution keys for tables and vaccuming to ensure continued optimal performance. | Signing up for an account takes 10 minutes and the initial provisioning of the system (adding key roles, warehouses, grants etc) takes about 2 hours. The process would be to signup, create an account, and invite me to your account to continue the setup. Snowflake is a fully managed platform that requires little to no maintenance. Most of the maintenance is spent optimizing queries which has to be done on every warehouse. |
Data Features / Limitations | |
65,535 bytes. You may run into issues storing very large texts.
|
|
Scalability | |
Clusters can be scaled up and down, resized, nodes added etc, however depending on the task, downtime for maintenance may be required. | Warehouses can be provisioned with almost no downtime. There is also flexibility on how you can scale, i.e. you can have certain tasks running on larger warehouses, and other tasks on smaller warehouses and scale up and down as needed. |
Pricing | |
Redshift offers a variety of options. It offers node configurations that combine storage and compute (d2c) or separate storage and compute (ra3, Serverless) Pricing is going to vary depending on the node type, size and quantity. Sample pricing assuming 100% uptime: dc2.large (1 node) – $0.25 per hour ra3.4xlarge (2 nodes) – $6.67 per hour | Snowflake separates storage and compute. Consequently, if there are no queries being run against the warehouse, you’re really only paying for storage. 1TB storage in AWS US East is $23 per month https://www.snowflake.com/en/data-cloud/pricing-options/ https://docs.snowflake.com/en/user-guide/cost-understanding-compute https://docs.snowflake.com/en/user-guide/warehouses-overview Computing costs assuming 100% uptime: Standard – X-Small – $2 per hour Standard – X-Large – $32 per hour Business Critical – X-Large – $64 per hour |