From the course: Data Pipeline Automation with GitHub Actions Using R and Python

Data pipeline architecture

- [Host] In the previous video, we reviewed the data scope and pipeline requirements. In this video, we'll review the data pipeline architecture for automating the refresh of the California subregions' electricity demand data. We'll use the following deployment. Let's now break it down into its different components, starting with the EIA API, our source of raw data. In the previous chapter, we reviewed how to set up and send a GET request to pull metadata and data from the API using the EIA metadata and EIA backfill functions. The pipeline's supporting functions will leverage those functions to extract data from the API. The second component is the data pipeline, whose main job is to check whether new data is available in the API and refresh the data when applicable. In addition, it also collects metadata on each step, enabling us to monitor the health of the data pipeline. The process is deployed on GitHub Actions, and we'll dive into more details about the deployment in…
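The check-and-refresh step described above can be sketched roughly as follows. This is a minimal illustration and not the course's actual code: the names `RefreshLog` and `refresh_if_new`, the `fetch` callable (a stand-in for whatever function pulls data from the API), and the assumption of hourly timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]


@dataclass
class RefreshLog:
    """Metadata collected on each run, used to monitor pipeline health."""
    checked_at: datetime
    local_end: datetime
    api_end: datetime
    new_data_available: bool
    rows_added: int
    success: bool
    comment: str


def refresh_if_new(
    local_end: datetime,
    api_end: datetime,
    fetch: Callable[[datetime, datetime], List[Row]],
    now: Optional[datetime] = None,
) -> Tuple[List[Row], RefreshLog]:
    """Pull new rows only when the API holds data newer than the local copy.

    local_end: last timestamp already stored locally.
    api_end:   last timestamp reported by the API metadata.
    fetch:     hypothetical callable returning rows for [start, end].
    """
    checked_at = now or datetime.now(timezone.utc)

    # Step 1: compare the local and remote end timestamps.
    if api_end <= local_end:
        log = RefreshLog(checked_at, local_end, api_end,
                         False, 0, True, "No new data available")
        return [], log

    # Step 2: request only the missing window (hourly data is assumed).
    start = local_end + timedelta(hours=1)
    try:
        rows = fetch(start, api_end)
    except Exception as exc:  # record the failure instead of crashing
        log = RefreshLog(checked_at, local_end, api_end,
                         True, 0, False, f"Fetch failed: {exc}")
        return [], log

    # Step 3: log the outcome of the refresh.
    log = RefreshLog(checked_at, local_end, api_end,
                     True, len(rows), True, "Refresh completed")
    return rows, log
```

Returning a log record on every run, including runs where nothing changed or the fetch failed, is one way to keep a history that a scheduled job can append to and that can later be inspected to monitor the pipeline's health.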
