Data Engineering Basics
Starting on 1st Feb 2026
Azure Basics
Azure Data Engineering Basics
Starting on 14th Feb 2026
SQL Server
SQL
SQL Test
Data Warehousing Concepts
Python
Databases
CosmosDB
Relational Databases in Azure
Azure Data Factory
Introduction
1.0 Section Introduction
1.1 What is Data Engineering
1.2 Azure Data Factory Overview
Creating ADF
2.1 Creating Azure Free Account
2.2 Azure Portal Overview
2.3 Creating Azure Data Factory
2.4 Creating Azure Data Lake Storage Gen2
2.5 Creating Azure SQL Database
ADF Getting Started
3.1 Top Level Concepts in ADF
3.2 DataSet, Linked Services and Pipelines concept
3.3 Demo - How to create linked services, datasets and copy
Integration Runtimes
4.1 Integration Runtimes (Compute for the Pipelines)
4.2 Integration Runtime types
4.3 Data Sources Location
4.4 Task: Self hosted integration runtime
4.5 Demo - Self Hosted Integration Runtime
Control Flow Pipelines
5.1 Control Flow Activities (Get Metadata, ForEach and IF)
5.2 Demo - Control Flow Activities (Get Metadata, ForEach and IF)
5.3 Delete Activity
5.4 Demo - Delete Activity
5.5 Stored Procedure and Lookup Activities
5.6 Demo - Transform Data Activities Stored Procedure and Lookup
5.7 Switch Activity
5.8 Demo - Switch Activity
5.9 Creating Dependency and Fail activity
5.10 Demo - Creating Dependency and Fail Activity
5.11 Wait
5.12 Demo - Wait
5.13 Nested Pipelines
5.14 Demo - Nested Pipelines
5.15 Until
5.16 Demo - Until Activity
Data Flows
6.1 Data Flow - Select, Filter and Aggregate
6.2 Demo - Select, Filter and Aggregation DataFlow
6.3 Conditional Split
6.4 Demo - Conditional Split
6.5 Union
6.6 Demo - Union
6.7 Join
6.8 Demo - Join
6.9 Exists
6.10 Demo - Exists
6.11 Data Flow Lookup
6.12 Demo - Lookup DataFlow
6.13 Derived Column
6.14 Demo - Derived Column
6.15 Surrogate Key
6.16 Demo - Surrogate Key
6.17 Assert
6.18 AlterRow
6.19 Demo - Assert
6.20 Parse
6.21 Demo - Parse
6.22 Stringify
6.23 Demo - Stringify
6.24 Window
6.25 Demo - Window
6.26 Pivot
6.27 Demo - Pivot
6.28 Unpivot
6.29 Demo - Unpivot
6.30 Flowlets
6.31 Demo - Flowlet
6.32 Flatten
6.33 Demo - Flatten
Parameterizing the Pipelines
7.1 Parameterizing the Pipelines
7.2 Parameterization of Pipelines
7.3 Demo - Parameterizing Pipelines
Triggers
8.1 Triggers and Types of Triggers
8.2 Scheduled Trigger
8.3 Demo - Schedule Trigger
8.4 Storage Event Trigger
8.5 Demo - Storage Event Trigger
8.6 Tumbling Trigger
8.7 Demo - Tumbling Trigger
Auditing
9.1 Auditing
9.2 Demo - Auditing
Securing Credentials
10.1 Security Azure Key Vaults
10.2 Demo - Security Azure Key Vaults
Private Endpoint for Azure IR
11.1 Networking for ADF and Synapse Azure provided IR
11.2 Demo - Networking Azure Provided IR
CI/CD for ADF
12.1 Development flow using DevOps
12.2 Demo - DevOps Pipelines
Sharing SHIR
13.1 Sharing Self hosted Integration Runtime
13.2 Demo - Shared Self Hosted Integration Runtime
Monitoring
14.1 Monitor and Manage
Azure Databricks
1.1 What is Big Data and Time Before Big Data
1.2 What is Hadoop
1.3 Benefits of Hadoop
1.4 Architecture of Hadoop and HDFS
1.5 YARN Architecture and Workflow
1.6 MapReduce and Its Drawbacks and Challenges
Spark Internals
2.0 Introduction to Spark
2.1 Spark Terminology
2.2 Explanation of Spark Internal Architecture
2.3 Life cycle of Spark Application
2.4 Features of Spark
2.5 Spark Eco-System
2.6 Explanation of Spark and Databricks History
2.7 Explanation of Databricks
Creating Databricks Service
3.1 Creating Databricks Community Edition Account
3.2 Creating Azure Databricks Service or Databricks Workspace
Introduction to Azure Databricks
4.0 Section Introduction
4.1 Azure Databricks Architecture
4.2 Explanation of Cluster and Its Types
4.3 Cluster Configurations and Modes
4.4 Access Modes and Cluster Policies
4.5 Creating Azure Databricks Cluster
4.6 Azure Databricks Cluster Pools
4.7 Azure Databricks Pricing Structure
4.8 Introduction to Databricks Notebook and Creating Notebook
4.9 Introduction to Markdown and Creating Markdown
4.10 Introduction to DBFS
4.11 Databricks Utilities (dbutils)
4.12 Demo - Databricks Utilities (dbutils)
4.13 Demo 02 - Databricks Utilities (dbutils) Notebook
4.14 Hive Metastore
RDD, DataFrame and Dataset
5.0 Section Introduction
5.1 What is RDD, DataFrame and Dataset
5.2 Spark SQL Engine or Catalyst Optimizer
5.3 Reading Files in Databricks
5.4 Demo-Read Data in Spark
5.5 InferSchema
5.6 Programmatic Schema Definition
5.7 DDL Schema
5.8 Demo-Schema, Creation and Operations
5.9 Demo-Handling Corrupted Records
File Formats
6.0 Section Introduction
6.1 CSV Format
6.2 Demo-Read CSV File
6.3 JSON Format
6.4 Demo-Reading JSON File
6.5 Demo-Flatten a Nested JSON File
6.6 AVRO Format
6.7 Demo-Reading AVRO File
6.8 Parquet Format
6.9 Demo-Reading Parquet File
6.10 ORC Format
6.11 Demo-Reading ORC File
6.12 Difference between File Formats
PySpark
7.0 Section Introduction
7.1 Introduction to Transformations and its Types
7.2 What are Jobs, Stages and Tasks
7.3 Transformation Create Two DataFrame
7.4 Transformation Filter
7.5 Transformation Filter Equality
7.6 Transformation Filter AND
7.7 Transformation Filter OR
7.8 Transformation Filter Startswith and Endswith
7.9 Transformation Filter Contains
7.10 Transformation Filter IsNull
7.11 Transformation Filter IsNotNull
7.12 Transformation Filter IsIn
7.13 Transformation Filter Inequality
7.14 Transformation Filter Like
7.15 Transformation SELECT
7.16 Transformation DROP
7.17 Transformation WithColumn
7.18 Transformation WithColumnRenamed
7.19 Transformation WithColumn Concatenate with Separator and Lit
7.20 Transformation Union and UnionAll
7.21 Transformation OrderBy
7.22 Transformation Sort Function
7.23 Demo-Basic Transformation Operations
7.24 Demo-Basic Transformation Operations 2
7.25 Transformation Joins
7.26 Demo-Joins
7.27 Demo-Union and UnionAll
7.27 Transformation Aggregate functions Count
7.28 Transformation GroupBy and Aggregations (SUM and AVG)
7.29 Transformation GroupBy and Aggregations (Count MIN and MAX)
7.30 Aggregate Functions
7.31 Transformation Distinct
7.32 Transformation Window Functions and Its Types
7.32 Transformation Window function Ranking Functions
7.33 Demo-Window Functions (Ranking Functions)
7.34 Transformation Window Function Analytic Functions (LAG)
7.35 Demo-Window Functions (Analytics Functions)
7.36 Transformation Window Functions Analytic Functions (Example)
7.37 Demo-Window Functions (Aggregate Functions)
7.38 Transformation Window Function Aggregate Functions (First and Last Value)
7.39 Transformation Fillna
7.40 Transformation Date and Timestamp Function 01
7.41 Transformation Date and Timestamp Function 02
7.42 Transformation Date and Timestamp Function 03
7.43 Demo-Date and Time Functions
7.44 Transformation String Manipulations 1
7.45 Transformation String Manipulations 2 (Concat)
7.46 Transformation String Manipulations 3 (Contains)
7.47 Transformation String Manipulations 4 (Startswith)
7.48 Transformation String Manipulations 5 Endswith
7.49 Transformation String Manipulations 6 (Initcap Upper and Lower)
7.49 Transformation String Manipulations 7 (Substring)
7.50 Transformation String Manipulations Length
7.51 Transformation String Manipulations (TRIM Ltrim and Rtrim)
7.52 Transformation String Manipulations (REGEX_EXTRACT)
7.53 Transformation String Manipulations (REGEX_REPLACE)
7.54 Transformation String Manipulations (RPAD)
7.55 Demo-String Manipulation
7.56 Transformation Pivot and Unpivot
7.57 Demo-Pivot and Unpivot
7.58 Transformation (Transform)
7.59 Demo-Transform Function
7.60 Transformation (Explode)
7.61 Demo-Explode Function
Widgets
8.1 Introduction to Widgets Parameterization and Its Types
8.2 Demo-Widgets
Data Lake
9.1 Introduction to Data Lake and Architecture
9.2 Introduction to Azure Storage Account
9.3 ADLS Integration with Databricks
9.4 Intro to ADLS access by Creating a Secret Scope
9.5 Demo-Access ADLS by Creating a Secret Scope
9.6 Mount Point using ADLS Access Key
9.7 Demo-Creating Mount Point using ADLS Access Key
9.8 Demo-ADLS Access by Direct Storage Key
9.9 ADLS Access By SAS Token Fourth Method
9.10 Demo-ADLS Access by SAS Token
Streaming Data
10.1 Introduction to Spark Streaming and Standard Architecture
10.2 Spark Structured Streaming Architecture
10.3 Spark Structured Streaming Internal Working
10.4 Demo-Spark Streaming
10.5 Types of Windows
10.6 Watermarking
10.7 Auto Loader Use Case
10.8 Auto Loader Architecture
10.9 Auto Loader Directory Listing Mode
10.10 Demo-Auto Loader Directory Listing
10.11 Auto Loader File Notification Mode
10.11 Demo-File Notification Mode
10.12 Schema Evolution
10.13 Schema Evolution (AddNewColumns)
10.14 Schema Evolution (Rescue)
10.15 FailOnNewColumns
10.16 None
10.17 Schema Inference
10.18 Demo-Schema Inference and Evolution
Delta Lake
11.1 Concept of Delta Lake and Its Advantages
11.2 Difference Between Data Warehouse, Data Lake and Delta Lake
11.3 Creating Delta Table (PySpark Method)
11.4 SQL Method
11.5 DataFrame Method
11.6 Demo-Delta Create Delta Tables
11.7 Features of Delta Table
11.8 Internal Architecture of Delta Table
11.9 Demo-Delta Internal Architecture
11.10 Spark Tables
11.11 Demo-Delta Spark Tables
11.12 Schema Evolution and Enforcement
11.13 Demo-Delta Schema Evolution
11.14 Time Travel and Data Versioning
11.15 Demo-Delta Time Travel and Versioning
11.16 Optimize Command
11.17 Demo-Delta Optimize
11.18 Restore Command
11.19 Demo-Delta Restore
11.20 Vacuum Command
11.21 Demo-Delta Vacuum
11.22 Merging Data into Delta Tables
11.23 Demo-Delta Upsert
11.24 Write Modes
11.25 Demo-Delta Write Modes
11.26 Updating Delta Tables
11.27 Delete Operations
11.28 Demo-Delta Delete
Performance Optimization
12.0 Section Introduction
12.1 Join Strategy
12.2 Types of Join Strategy
12.3 Broadcast Hash Join
12.4 Shuffle Hash Join
12.5 Shuffle Sort-Merge Join
12.6 Cartesian Product Join
12.7 Broadcast Nested Loop Join
12.8 Introduction to Adaptive Query Execution and Types
12.9 Dynamically Coalesce Shuffle Partition
12.10 Dynamically Switching Joins
12.11 Dynamically Optimizing Skew Join
12.12 Repartition and Coalesce
12.13 Coalesce
12.14 Introduction to Data Caching and Persist
12.15 Memory_Only Storage Level
12.16 Memory_Only_Ser and Memory_And_Disk
12.17 Memory_And_Disk_Ser and Disk_Only
12.18 Explanation of Shared Variables and Types
12.19 Accumulator Variable
12.20 Explanation of Spark Memory Management
12.21 Executor Memory
12.22 Spark Memory Manager and Types
12.23 Mitigation Strategies
12.24 PartitionBy
12.25 Partition Pruning and Predicate Pushdown
12.26 Dynamic Partition Pruning
WorkFlow
13.1 Introduction to Azure Databricks Workflow
13.2 Demo-Workflow
Delta Live Tables
14.1 Introduction to Delta Live Tables
14.2 Medallion Lakehouse Architecture
14.3 Delta Live Tables Architecture
14.4 Delta Live Tables Datasets (Streaming Tables)
14.5 Delta Live Tables (Materialized View)
14.6 Delta Live Tables (Views)
14.7 How to Create Delta Live Tables
14.8 When to use Datasets ST MV View
14.9 Explanation of Data Quality
14.10 Different Type Invalid Record Actions
14.11 Explanation of Change Data Capture (CDC)
Synapse
Fabric
Snowflake
Database Performance Tuning
Azure Advance for Data Architects
Building the Data Solutions
Sample Projects
Preview - Azure Data Architect