# L2 Data Engineer - Remote

**Company:** [DeepSource Technologies](http://jobs.workable.com/companies/bSz5yPYQuy7zGF2AaBwdRp.md)
**Location:** Remote
**Workplace:** remote
**Employment type:** Full-time

[Apply for this job](http://jobs.workable.com/view/c7b5bf80-8c72-4b7f-bb0c-d85ac6aa9122)

## Description

We are looking for a highly skilled and experienced L2 Data Engineer to join our growing Data & Analytics team. In this role, you will lead the design, development, optimization, and maintenance of scalable enterprise data platforms and cloud-native data solutions. You will work closely with architects, analysts, and business stakeholders to build high-performance data pipelines and modern lakehouse solutions that support advanced analytics, reporting, and data-driven decision-making.

This opportunity is ideal for a senior data professional with strong hands-on expertise in Databricks and the Microsoft Azure ecosystem, who is passionate about building reliable, scalable, and optimized data platforms in enterprise environments.

**KEY RESPONSIBILITIES**

• Design, develop, and optimize enterprise-scale data pipelines and ETL/ELT workflows using Azure and Databricks technologies.

• Architect and implement scalable data ingestion, transformation, and orchestration processes using Azure Data Factory, Databricks, and Azure Synapse Analytics.

• Develop high-performance data transformation frameworks using PySpark, Python, and Spark SQL for large-scale distributed data processing.

• Optimize SQL queries, Spark jobs, and data workflows to improve performance, scalability, and cost efficiency.

• Lead data migration initiatives, including SQL Server migrations and modernization of legacy data platforms.

• Implement and maintain Delta Lake architecture, incremental data loading strategies, and enterprise data lake best practices.

• Collaborate with architects and cross-functional teams to design robust and scalable data models aligned with business and governance standards.

• Monitor and troubleshoot production pipelines, perform root-cause analysis, and implement preventive measures for recurring issues.

• Support CI/CD implementation and infrastructure automation for data engineering workflows.

• Mentor junior engineers and contribute to engineering standards, reusable frameworks, and technical best practices.

• Create and maintain technical documentation including architecture diagrams, pipeline documentation, and operational runbooks.

• Evaluate and recommend modern data engineering tools, frameworks, and optimization strategies.

## Requirements

• 5+ years of professional experience in Data Engineering or related roles.

• Strong expertise in Python for enterprise data processing, transformation, and automation.

• Advanced hands-on experience with Pandas, PySpark, and Spark SQL for large-scale distributed processing.

• Strong experience with Databricks, including cluster management, notebook development, workflow orchestration, Delta Lake, and performance optimization.

• Extensive experience building and managing enterprise data pipelines using Azure Data Factory.

• Strong working knowledge of Azure Synapse Analytics, particularly Spark pool integration and enterprise data warehousing concepts.

• Advanced SQL skills including query optimization, performance tuning, indexing strategies, and troubleshooting.

• Strong understanding of data lake architecture, Delta Lake, incremental processing, partitioning, and lakehouse concepts.

• Experience implementing data governance, security, access controls, and monitoring within cloud data platforms.

• Experience handling production support, troubleshooting, and optimization of enterprise data platforms.

**NICE TO HAVE**

• Experience with Terraform for Azure infrastructure provisioning and Infrastructure-as-Code (IaC).

• Experience implementing CI/CD pipelines for data engineering deployments.

• Exposure to Lakehouse Federation, Delta Sharing, and modern data sharing architectures.

• Experience with streaming and near real-time data processing solutions.

• Knowledge of DevOps practices and cloud cost optimization strategies.

**CERTIFICATION REQUIREMENT**

Candidates are expected to hold or be actively working toward the Databricks Certified Data Engineer Professional certification. This certification validates advanced expertise across the following domains:

• Advanced ETL and ELT development using Spark SQL and PySpark

• Enterprise-grade pipeline orchestration and optimization

• Data modeling and scalable lakehouse architecture

• Performance tuning and distributed data processing optimization

• Advanced data governance and security implementation

• Production-grade data engineering practices within the Databricks ecosystem ([exam guide](https://www.databricks.com/sites/default/files/2025-11/databricks-certified-data-engineer-professional-exam-guide-november-30-2025_0.pdf))
