From the course: Data Engineering Pipeline Management with Apache Airflow

Purchases producer pipeline and join pipeline

- [Instructor] With data-aware scheduling, you can have a DAG wait for two datasets to be available before it's executed. That's what we'll see in this demo, which will span more than one movie. Here in my datasets folder, I have two CSV files: the customers CSV file, which we've already seen and worked with earlier, and an additional file for customer purchases. If you open up this file, you'll see that it contains various products, along with the price and quantity of those products, purchased by different customers. The second column here is the customer ID; that's the customer that made the purchase. First, let's get a big-picture understanding of what we're about to do. Observe that under the DAGs folder, I have four different Python files. You're already familiar with the producer pipeline for customers and the consumer pipeline for customers. Those are the DAGs that we've already…
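
To make the "wait for two datasets" idea concrete, here is a minimal sketch of the triggering rule in plain Python. It is not Airflow code and not the instructor's pipeline: the `JoinPipeline` class, its method names, and the two file names are hypothetical placeholders. In Airflow itself, the same dependency is typically declared by passing a list of dataset objects to the consuming DAG's `schedule` argument.

```python
# Toy model of data-aware scheduling: NOT Airflow code, only the triggering rule.
from dataclasses import dataclass, field


@dataclass
class JoinPipeline:
    """Hypothetical consumer that waits on a fixed set of datasets."""

    required: frozenset
    updated: set = field(default_factory=set)
    runs: int = 0

    def on_dataset_update(self, name: str) -> bool:
        """Record an update; run only once every required dataset has one."""
        if name in self.required:
            self.updated.add(name)
        if self.updated == set(self.required):
            self.runs += 1
            self.updated.clear()  # start waiting for a fresh round of updates
            return True
        return False


# Dataset names are placeholders for the two CSV files in the demo.
join = JoinPipeline(required=frozenset({"customers.csv", "purchases.csv"}))

first = join.on_dataset_update("customers.csv")   # only one of two is ready
second = join.on_dataset_update("purchases.csv")  # both are ready now
```

In this sketch, the first update alone does not start the join; the run happens only after the second dataset has also been updated, and the pipeline then waits for a new round of updates to both.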
