From the course: Delivering Data-Driven Decisions with AWS: Applying Machine Learning, Data Engineering, and Generative AI
Overview of lakehouse architecture
- [Narrator] As data grows into petabytes, we can modernize our data architecture to remove data silos. In this video, we will discuss data engineering concepts including the data lake, data warehouse, data lakehouse, and data mesh. Think about this. Suddenly your boss says, "Let's move our data into the data lake." What is a data lake? It's a centralized repository for storing raw data, and an organization can have more than one. It holds structured data like spreadsheets and relational database tables, semi-structured data like JSON and XML, and unstructured data like text and MP3 audio. Amazon S3 is an object storage service commonly used as a data lake, and it offers high availability, durability, and scalability. Let's explore the data warehouse. Data warehouses were very popular in the 1990s. They comprised operational relational databases such as Oracle, MySQL, and PostgreSQL. Operational databases perform online transaction processing (OLTP), like debiting your bank account.…
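To make the OLTP idea concrete, here is a minimal sketch of a bank-account debit written as a single transaction. It is not from the course: it assumes Python's built-in `sqlite3` module as a stand-in for an operational database, and the `accounts` and `ledger` tables and the `debit` and `balance` helpers are hypothetical names chosen for illustration.

```python
import sqlite3


def debit(conn, account_id, amount):
    """Debit `amount` (in whole cents) from one account as a single transaction."""
    # Using the connection as a context manager commits if the block
    # succeeds and rolls back if it raises.
    with conn:
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        # Record the change so the balance update and the audit row
        # are committed together or not at all.
        conn.execute(
            "INSERT INTO ledger (account_id, delta) VALUES (?, ?)",
            (account_id, -amount),
        )


def balance(conn, account_id):
    """Return the current balance for an account."""
    return conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()[0]


# Hypothetical schema and seed data, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("CREATE TABLE ledger (account_id INTEGER, delta INTEGER)")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.commit()

debit(conn, 1, 30)
```

The point of the sketch is the shape of the workload: many small reads and writes, each touching a handful of rows and each needing to succeed or fail as a unit.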