Organizations

3 results for Beginners
  • Project Overview

    This repository demonstrates workflow orchestration for data engineering pipelines using Kestra. It guides users through building, running, and scheduling data pipelines that extract, transform, and load (ETL) data both locally (with PostgreSQL) and in the cloud (with Google Cloud Platform). The project is hands-on and includes conceptual explanations, infrastructure setup, and several example pipeline flows.


    Key Concepts

    • Workflow Orchestration: Automating and managing complex workflows with dependencies, retries, logging, and monitoring.
    • Kestra: An orchestration platform with a user-friendly UI and YAML-based workflow definitions (called “flows”).
    • Data Lake & Data Warehouse: Demonstrates moving data from raw storage (GCS) to structured analytics (BigQuery).
  • Project Overview

    This repository provides a comprehensive, step-by-step guide to building a simple data engineering pipeline using containerization (Docker), orchestration (Docker Compose), and Infrastructure as Code (Terraform), with a focus on ingesting and processing NYC taxi data. The project is hands-on and includes conceptual explanations, infrastructure setup, and several example pipeline flows.

    This project is a practical template for data engineers to learn and implement containerized data pipelines, local and cloud database management, and automated cloud infrastructure provisioning using modern tools like Docker, Docker Compose, and Terraform. It is especially useful for those looking to understand the end-to-end workflow from local prototyping to cloud deployment in a reproducible, automated way.

  • Data Engineering Fundamentals

    Data engineering is the backbone of any data-driven organization. In this post, we will explore the fundamental concepts that every aspiring data engineer should understand.

    What is Data Engineering?

    Data engineering focuses on designing, building, and maintaining the infrastructure and architecture for data generation, storage, and analysis. Data engineers develop the systems that collect, manage, and convert raw data into usable information for data scientists and business analysts.

    data engineering beginners tutorial Created Mon, 05 May 2025 09:30:00 +0100