Preview Azure Data Architect

Data Engineering Basics

Starting on 1st Feb 2026

Azure Basics

Azure Data Engineering Basics

Starting on 14th Feb 2026

SQL Server

SQL

SQL Test

Data Warehousing Concepts

Python

Databases

CosmosDB

Relational Databases in Azure

Azure Data Factory

Introduction

1.0 Section Introduction

1.1 What is Data Engineering

1.2 Azure Data Factory Overview

Creating ADF

2.1 Creating Azure Free Account

2.2 Azure Portal Overview

2.3 Creating Azure Data Factory

2.4 Creating Azure Data Lake Storage Gen2

2.5 Creating Azure SQL Database

ADF Getting Started

3.1 Top Level Concepts in ADF

3.2 DataSet, Linked Services and Pipelines concept

3.3 Demo - How to create linked services, datasets and copy

Integration Runtimes

4.1 Integration Runtimes (Compute for the Pipelines)

4.2 Integration Runtime types

4.3 Data Sources Location

4.4 Task: Self hosted integration runtime

4.5 Demo - Self Hosted Integration Runtime

Control Flow Pipelines

5.1 Control Flow Activities (Get Metadata, ForEach and IF)

5.2 Demo - Control Flow Activities (Get Metadata, ForEach and IF)

5.3 Delete Activity

5.4 Demo - Delete Activity

5.5 Stored Procedure and Lookup Activities

5.6 Demo - Transform Data Activities Stored Procedure and Lookup

5.7 Switch Activity

5.8 Demo - Switch Activity

5.9 Creating Dependency and Fail activity

5.10 Demo - Creating Dependency and Fail Activity

5.11 Wait

5.12 Demo - Wait

5.13 Nested Pipelines

5.14 Demo - Nested Pipelines

5.15 Until

5.16 Demo - Until Activity

Data Flows

6.1 Data Flow - Select, Filter and Aggregate

6.2 Demo - Select, Filter and Aggregate DataFlow

6.3 Conditional Split

6.4 Demo - ConditionalSplit

6.5 Union

6.6 Demo - Union

6.7 Join

6.8 Demo - Join

6.9 Exists

6.10 Demo - Exists

6.11 Data Flow Lookup

6.12 Demo - Lookup DataFlow

6.13 Derived Column

6.14 Demo - Derived Column

6.15 Surrogate Key

6.16 Demo - Surrogate Key

6.17 Assert

6.18 AlterRow

6.19 Demo - Assert

6.20 Parse

6.21 Demo - Parse

6.22 Stringify

6.23 Demo - Stringify

6.24 Window

6.25 Demo - Window

6.26 Pivot

6.27 Demo - Pivot

6.28 Unpivot

6.29 Demo - Unpivot

6.30 Flowlets

6.31 Demo - Flowlet

6.32 Flatten

6.33 Demo - Flatten

Parameterizing the Pipelines

7.1 Parameterizing the Pipelines

7.2 Parameterization of Pipelines

7.3 Demo - Parameterizing Pipelines

Triggers

8.1 Triggers and Types of Triggers

8.2 Scheduled Trigger

8.3 Demo - Schedule Trigger

8.4 Storage Event Trigger

8.5 Demo - Storage Event Trigger

8.6 Tumbling Trigger

8.7 Demo - Tumbling Trigger

Auditing

9.1 Auditing

9.2 Demo - Auditing

Securing Credentials

10.1 Security Azure Key Vaults

10.2 Demo - Security Azure Key Vaults

Private Endpoint for Azure IR

11.1 Networking for ADF and Synapse Azure provided IR

11.2 Demo - Networking Azure Provided IR

CI/CD for ADF

12.1 Development flow using DevOps

12.2 Demo - DevOps Pipelines

Sharing SHIR

13.1 Sharing Self hosted Integration Runtime

13.2 Demo - Shared Self Hosted Integration Runtime

Monitoring

14.1 Monitor and Manage

Azure Databricks

1.1 What is Big Data and Time Before Big Data

1.2 What is Hadoop

1.3 Benefits of Hadoop

1.4 Architecture of Hadoop and HDFS

1.5 YARN Architecture and Workflow

1.6 MapReduce and Its Drawbacks and Challenges

Spark Internals

2.0 Introduction to Spark

2.1 Spark Terminology

2.2 Explanation of Spark Internal Architecture

2.3 Life cycle of Spark Application

2.4 Features of Spark

2.5 Spark Eco-System

2.6 Explanation of Spark and Databricks History

2.7 Explanation of Databricks

Creating Databricks Service

3.1 Creating Databricks Community Edition Account

3.2 Creating Azure Databricks Service or Databricks Workspace

Introduction to Azure Databricks

4.0 Section Introduction

4.1 Azure Databricks Architecture

4.2 Explanation of Cluster and Its Types

4.3 Cluster Configurations and Modes

4.4 Access Modes and Cluster Policies

4.5 Creating Azure Databricks Cluster

4.6 Azure Databricks Cluster Pools

4.7 Azure Databricks Pricing Structure

4.8 Introduction to Databricks Notebook and Creating Notebook

4.9 Introduction to Markdown and Creating Markdown

4.10 Introduction to DBFS

4.11 Databricks Utilities (dbutils)

4.12 Demo - Databricks Utilities (dbutils)

4.13 Demo 02 - Databricks Utilities (dbutils) Notebook

4.14 Hive Metastore

RDD, DataFrame and Dataset

5.0 Section Introduction

5.1 What is RDD, DataFrame and Dataset

5.2 Spark SQL Engine or Catalyst Optimizer

5.3 Reading Files in Databricks

5.4 Demo-Read Data in Spark

5.5 InferSchema

5.6 Programmatic Schema Definition

5.7 DDL Schema

5.8 Demo-Schema, Creation and Operations

5.9 Demo-Handling Corrupted Records

File Formats

6.0 Section Introduction

6.1 CSV Format

6.2 Demo-Read CSV File

6.3 JSON Format

6.4 Demo-Reading JSON File

6.5 Demo-Flatten a Nested JSON File

6.6 AVRO Format

6.7 Demo-Reading AVRO File

6.8 Parquet Format

6.9 Demo-Reading Parquet File

6.10 ORC Format

6.11 Demo-Reading ORC File

6.12 Difference between File Formats

PySpark

7.0 Section Introduction

7.1 Introduction to Transformations and its Types

7.2 What are Jobs, Stages and Tasks

7.3 Transformation Create Two DataFrames

7.4 Transformation Filter

7.5 Transformation Filter Equality

7.6 Transformation Filter AND

7.7 Transformation Filter OR

7.8 Transformation Filter Startswith and Endswith

7.9 Transformation Filter Contains

7.10 Transformation Filter IsNull

7.11 Transformation Filter IsNotNull

7.12 Transformation Filter IsIn

7.13 Transformation Filter Inequality

7.14 Transformation Filter Like

7.15 Transformation SELECT

7.16 Transformation DROP

7.17 Transformation WithColumn

7.18 Transformation WithColumnRenamed

7.19 Transformation WithColumn Concatenate with Separator and Lit

7.20 Transformation Union and UnionAll

7.21 Transformation OrderBy

7.22 Transformation Sort Function

7.23 Demo-Basic Transformation Operations

7.24 Demo-Basic Transformation Operations 2

7.25 Transformation Joins

7.26 Demo-Joins

7.27 Demo-Union and UnionAll

7.27 Transformation Aggregate functions Count

7.28 Transformation GroupBy and Aggregations (SUM and AVG)

7.29 Transformation GroupBy and Aggregations (Count MIN and MAX)

7.30 Aggregate Functions

7.31 Transformation Distinct

7.32 Transformation Window Functions and Their Types

7.32 Transformation Window function Ranking Functions

7.33 Demo-Window Functions (Ranking Functions)

7.34 Transformation Window Function Analytic Functions (LAG)

7.35 Demo-Window Functions (Analytics Functions)

7.36 Transformation Window Functions Analytic Functions (Example)

7.37 Demo-Window Functions (Aggregate Functions)

7.38 Transformation Window Function Aggregate Functions (First and Last Value)

7.39 Transformation Fillna

7.40 Transformation Date and Timestamp Function 01

7.41 Transformation Date and Timestamp Function 02

7.42 Transformation Date and Timestamp Function 03

7.43 Demo-Date and Time Functions

7.44 Transformation String Manipulations 1

7.45 Transformation String Manipulations 2 (Concat)

7.46 Transformation String Manipulations 3 (Contains)

7.47 Transformation String Manipulations 4 (Startswith)

7.48 Transformation String Manipulations 5 (Endswith)

7.49 Transformation String Manipulations 6 (Initcap Upper and Lower)

7.49 Transformation String Manipulations 7 (Substring)

7.50 Transformation String Manipulations Length

7.51 Transformation String Manipulations (TRIM Ltrim and Rtrim)

7.52 Transformation String Manipulations (REGEX_EXTRACT)

7.53 Transformation String Manipulations (REGEX_REPLACE)

7.54 Transformation String Manipulations (RPAD)

7.55 Demo-String Manipulation

7.56 Transformation Pivot and Unpivot

7.57 Demo-Pivot and Unpivot

7.58 Transformation (Transform)

7.59 Demo-Transform Function

7.60 Transformation (Explode)

7.61 Demo-Explode Function

Widgets

8.1 Introduction to Widgets Parameterization and Its Types

8.2 Demo-Widgets

Data Lake

9.1 Introduction to Data Lake and Architecture

9.2 Introduction to Azure Storage Account

9.3 ADLS Integration with Databricks

9.4 Intro to ADLS access by Creating a Secret Scope

9.5 Demo-Access ADLS by Creating a Secret Scope

9.6 Mount Point using ADLS Access Key

9.7 Demo-Creating Mount Point using ADLS Access Key

9.8 Demo-ADLS Access by Direct Storage Key

9.9 ADLS Access by SAS Token (Fourth Method)

9.10 Demo-ADLS Access by SAS Token

Streaming Data

10.1 Introduction to Spark Streaming and Standard Architecture

10.2 Spark Structured Streaming Architecture

10.3 Spark Structured Streaming Internal Working

10.4 Demo-Spark Streaming

10.5 Types of Windows

10.6 Watermarking

10.7 Auto Loader Use Case

10.8 Auto Loader Architecture

10.9 Auto Loader Directory Listing Mode

10.10 Demo-Auto Loader Directory Listing

10.11 Auto Loader File Notification Mode

10.11 Demo-File Notification Mode

10.12 Schema Evolution

10.13 Schema Evolution (AddNewColumns)

10.14 Schema Evolution (Rescue)

10.15 Schema Evolution (FailOnNewColumns)

10.16 Schema Evolution (None)

10.17 Schema Inference

10.18 Demo-Schema Inference and Evolution

Delta Lake

11.1 Concept of Delta Lake and Its Advantages

11.2 Difference Between Data Warehouse, Data Lake and Delta Lake

11.3 Creating Delta Table (PySpark Method)

11.4 SQL Method

11.5 DataFrame Method

11.6 Demo-Delta Create Delta Tables

11.7 Features of Delta Table

11.8 Internal Architecture of Delta Table

11.9 Demo-Delta Internal Architecture

11.10 Spark Tables

11.11 Demo-Delta Spark Tables

11.12 Schema Evolution and Enforcement

11.13 Demo-Delta Schema Evolution

11.14 Time Travel and Data Versioning

11.15 Demo-Delta Time Travel and Versioning

11.16 Optimize Command

11.17 Demo-Delta Optimize

11.18 Restore Command

11.19 Demo-Delta Restore

11.20 Vacuum Command

11.21 Demo-Delta Vacuum

11.22 Merging Data into Delta Tables

11.23 Demo-Delta Upsert

11.24 Write Modes

11.25 Demo-Delta Write Modes

11.26 Updating Delta Tables

11.27 Delete Operations

11.28 Demo-Delta Delete

Performance Optimization

12.0 Section Introduction

12.1 Join Strategy

12.2 Types of Join Strategy

12.3 Broadcast Hash Join

12.4 Shuffle Hash Join

12.5 Shuffle Sort-Merge Join

12.6 Cartesian Product Join

12.7 Broadcast Nested Loop Join

12.8 Introduction to Adaptive Query Execution and Types

12.9 Dynamically Coalesce Shuffle Partition

12.10 Dynamically Switching Joins

12.11 Dynamically Optimizing Skew Join

12.12 Repartition and Coalesce

12.13 Coalesce

12.14 Introduction to Data Caching and Persist

12.15 Memory_Only Storage Level

12.16 Memory_Only_Ser and Memory_And_Disk

12.17 Memory_And_Disk_Ser and Disk_Only

12.18 Explanation of Shared Variables and Types

12.19 Accumulator Variable

12.20 Explanation of Spark Memory Management

12.21 Executor Memory

12.22 Spark Memory Manager and Types

12.23 Mitigation Strategies

12.24 PartitionBy

12.25 Partition Pruning and Predicate Pushdown

12.26 Dynamic Partition Pruning

WorkFlow

13.1 Introduction to Azure Databricks Workflow

13.2 Demo-Workflow

Delta Live Tables

14.1 Introduction to Delta Live Tables

14.2 Medallion Lakehouse Architecture

14.3 Delta Live Tables Architecture

14.4 Delta Live Tables Datasets (Streaming Tables)

14.5 Delta Live Tables (Materialized View)

14.6 Delta Live Tables (Views)

14.7 How to Create Delta Live Tables

14.8 When to Use Datasets (Streaming Table, Materialized View, View)

14.9 Explanation of Data Quality

14.10 Different Types of Invalid Record Actions

14.11 Explanation of Change Data Capture (CDC)

Synapse

Fabric

Snowflake

Database Performance Tuning

Azure Advanced for Data Architects

Building the Data Solutions

Sample Projects
