Best Practices for Transforming and Analyzing Data in Your Data Lake – AWS Online Tech Talks
Companies are increasingly building data lakes to apply modern analytic techniques to data that was previously siloed. To be useful, however, the data in a data lake needs to be transformed, partitioned, and cataloged so that different data consumers can draw insights from data that has been optimized for analytics. In this tech talk, learn best practices for transforming your data so that it is optimized for analytics, including partitioning strategies and the use of columnar file formats. We’ll also talk about the different AWS services that can be used to catalog, transform, and analyze data in the data lake.
– Learn about best practices for partitioning your data to optimize for analytics
– Understand the benefits of using columnar file formats for your analytics
– Learn about AWS services that can be used to catalog, transform, and analyze data in your data lake
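
As a rough illustration of the partitioning topic above, the sketch below groups records under Hive-style `key=value` prefixes by date, a layout that query engines can use to skip partitions a query does not need. It is a minimal sketch and not taken from the talk: the table name `events`, the `event_date` field, and the year/month/day scheme are all assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import date


def partition_prefix(table, event_date):
    """Build a Hive-style prefix such as 'events/year=2018/month=03/day=05/'.

    The year/month/day scheme is an assumed example; the right partition
    keys depend on how the data is actually queried.
    """
    return (
        f"{table}/year={event_date.year}"
        f"/month={event_date.month:02d}"
        f"/day={event_date.day:02d}/"
    )


def group_by_partition(table, records):
    """Group records by the partition prefix they would be written under."""
    groups = defaultdict(list)
    for record in records:
        groups[partition_prefix(table, record["event_date"])].append(record)
    return dict(groups)


# Hypothetical sample records, for illustration only.
sample_records = [
    {"event_date": date(2018, 3, 5), "user": "a"},
    {"event_date": date(2018, 3, 5), "user": "b"},
    {"event_date": date(2018, 3, 6), "user": "c"},
]

partitions = group_by_partition("events", sample_records)
for prefix, rows in sorted(partitions.items()):
    print(prefix, len(rows))
```

A query filtered to a single day then only needs to read the objects under that day's prefix. In practice each group would be written out in a columnar format such as Parquet or ORC rather than kept in memory.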