João Blasques (Jonas)

João Blasques (Jonas) joaoblasques

AI-Enabled Data Engineer

Organizations

README.md

Hello, I’m João Blasques

Welcome to my professional website. I’m an AI-Enabled Data Engineer with over 5 years of experience in the tech and programming space and 1 year of experience in designing, implementing, and optimizing data pipelines and machine learning solutions.

About Me

I specialize in data engineering, artificial intelligence, and machine learning applications. My expertise includes ETL/ELT pipelines, cloud platforms (AWS, GCP, Azure), DevOps and MLOps. I believe in transforming complex data challenges into actionable insights and automated systems that drive business growth and operational efficiency.

Core Expertise

Data Engineering: ETL/ELT pipelines, data warehousing, stream processing
AI & Machine Learning: TensorFlow, PyTorch, scikit-learn, NLP
Cloud Platforms: AWS, Google Cloud Platform, Azure
Big Data Technologies: Spark, Databricks, Snowflake, Kafka, Airflow
DevOps: CI/CD, Testing, Automation, Terraform (IaC), Docker, Kubernetes

Contact

Feel free to reach out if you’d like to discuss potential collaborations, data engineering challenges, or AI implementations:

Popular posts

Data Pipeline Orchestration using Kestra
Project Overview

This repository demonstrates workflow orchestration for data engineering pipelines using Kestra. It guides users through building, running, and scheduling data pipelines that extract, transform, and load (ETL) data both locally (with PostgreSQL) and in the cloud (with Google Cloud Platform). The project is hands-on and includes conceptual explanations, infrastructure setup, and several example pipeline flows.

Key Concepts
- Workflow Orchestration: Automating and managing complex workflows with dependencies, retries, logging, and monitoring.
- Kestra: An orchestration platform with a user-friendly UI and YAML-based workflow definitions (called “flows”).
- Data Lake & Data Warehouse: Demonstrates moving data from raw storage (GCS) to structured analytics (BigQuery).
data engineering beginners tutorial docker kestra
Simple Data Pipeline

Project Overview

This repository provides a comprehensive, step-by-step guide to building a simple data engineering pipeline using containerization (Docker), orchestration (Docker Compose), and Infrastructure as Code (Terraform), with a focus on ingesting and processing NYC taxi data. The project is hands-on and includes conceptual explanations, infrastructure setup, and several example pipeline flows.

This project is a practical template for data engineers to learn and implement containerized data pipelines, local and cloud database management, and automated cloud infrastructure provisioning using modern tools like Docker, Docker Compose, and Terraform. It is especially useful for those looking to understand the end-to-end workflow from local prototyping to cloud deployment in a reproducible, automated way.

data engineering beginners tutorial docker terraform
The Role of AI in Modern Data Architectures
AI-Driven Data Architecture

Artificial intelligence isn’t just a consumer of data—it’s increasingly becoming an integral part of how we design and operate our data systems. This post explores the evolving relationship between AI and data architecture.

AI-Enhanced Data Processing

Modern data architectures are incorporating AI at various levels:
- Intelligent Data Cataloging - Automatically discovering, classifying, and tagging data assets
- Adaptive Data Integration - Using ML to identify optimal integration patterns and transformations
- Automated Quality Management - Detecting anomalies and quality issues without manual rules
- Self-Tuning Systems - Databases and data platforms that optimize themselves based on workloads
Real-World Applications

Recommendation Systems

AI algorithms help determine which data is most relevant to different users and use cases, optimizing data discovery and access.
AI data architecture innovation
Machine Learning Pipeline Design
Building Effective Machine Learning Pipelines

Creating robust machine learning pipelines is essential for deploying AI solutions at scale. This post covers key considerations and best practices.

The Anatomy of an ML Pipeline

A well-designed ML pipeline includes these key stages:
1. Data Ingestion - Collecting and importing data from various sources
2. Data Preparation - Cleaning, transforming, and feature engineering
3. Model Training - Developing and tuning ML models
4. Model Evaluation - Assessing performance and validity
5. Model Deployment - Serving models in production environments
6. Monitoring - Tracking performance and detecting drift
Common Challenges and Solutions

Challenge: Data Quality Issues

Solution: Implement robust data validation and cleaning processes early in the pipeline.
machine learning pipelines MLOps
Getting Started with Data Engineering

Data Engineering Fundamentals

Data engineering is the backbone of any data-driven organization. In this post, we will explore the fundamental concepts that every aspiring data engineer should understand.

What is Data Engineering?

Data engineering focuses on designing, building, and maintaining the infrastructure and architecture for data generation, storage, and analysis. Data engineers develop the systems that collect, manage, and convert raw data into usable information for data scientists and business analysts.

data engineering beginners tutorial
Welcome to My Professional Website

Hello, I’m João Blasques

Welcome to my professional website! I’m an AI-Enabled Data Engineer passionate about leveraging artificial intelligence and data solutions to solve complex business problems.

My Background

With expertise in data engineering, machine learning, and AI integration, I help organizations transform their data into actionable insights. I specialize in designing and implementing data pipelines, creating machine learning models, and developing AI-powered applications that drive business value.

What You’ll Find Here

On this website, you can explore:

introduction about

João Blasques (Jonas) joaoblasques

Organizations

Hello, I’m João Blasques

About Me

Core Expertise

Contact

Popular posts

Project Overview

Key Concepts

Project Overview

AI-Driven Data Architecture

AI-Enhanced Data Processing

Real-World Applications

Recommendation Systems

Building Effective Machine Learning Pipelines

The Anatomy of an ML Pipeline

Common Challenges and Solutions

Challenge: Data Quality Issues

Data Engineering Fundamentals

What is Data Engineering?

Hello, I’m João Blasques

My Background

What You’ll Find Here

Post activity