Data Engineer | ETL Specialist | Azure Databricks Expert
Detail-oriented Data Engineer with 3.7 years of experience specializing in data ingestion, transformation, and processing using Python, PySpark, SQL, and Azure Databricks. Hands-on experience supporting data access management, with recently strengthened skills in Pandas, NumPy, and FastAPI for data manipulation and analysis.
Passionate about building scalable data pipelines and solving complex data challenges in cloud environments.
Built an end-to-end real-time data processing pipeline that ingests streaming data from multiple sources, transforms it using PySpark, and loads it into Azure Data Lake for analytics. The pipeline handles millions of records daily with high reliability and performance.
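A minimal sketch of what such a streaming pipeline can look like, assuming an Event Hubs source exposed through its Kafka-compatible endpoint; the connection strings, event schema, and storage paths below are placeholders, not the production values.

```python
# Illustrative Structured Streaming pipeline (source, schema, and paths are hypothetical)
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("realtime-ingestion").getOrCreate()

# Example event schema (assumed for illustration)
schema = StructType([
    StructField("event_id", StringType()),
    StructField("source", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the stream via the Kafka-compatible interface of Event Hubs
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
       .option("subscribe", "events")
       .load())

# Parse the JSON payload and keep only well-formed records
parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), schema).alias("e"))
          .select("e.*")
          .filter(col("event_id").isNotNull()))

# Append the transformed stream to Azure Data Lake as Delta for analytics
query = (parsed.writeStream
         .format("delta")
         .option("checkpointLocation", "abfss://checkpoints@<storage>.dfs.core.windows.net/events")
         .outputMode("append")
         .start("abfss://curated@<storage>.dfs.core.windows.net/events"))
```

The checkpoint location is what gives the stream exactly-once recovery after restarts, which is the main lever for reliability at this volume.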
Designed and implemented an automated reporting system using Python and FastAPI that generates customized reports from multiple data sources. The system reduced manual reporting time by 70% and improved data accuracy through automated validation checks.
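A small sketch of the reporting pattern, assuming a hypothetical sales data source and validation rules; the endpoint name, columns, and checks are illustrative, not the actual system.

```python
# Illustrative report endpoint with automated validation (names and rules are assumptions)
from fastapi import FastAPI, HTTPException
import pandas as pd

app = FastAPI(title="reporting-service")

def load_sales(period: str) -> pd.DataFrame:
    # Placeholder for a real extract (e.g. a SQL query or a lake read)
    return pd.DataFrame({"region": ["EU", "US"], "amount": [1200.0, 950.0], "period": [period, period]})

def validate(df: pd.DataFrame) -> None:
    # Automated validation checks run before any report is produced
    if df.empty:
        raise HTTPException(status_code=422, detail="no data for requested period")
    if (df["amount"] < 0).any():
        raise HTTPException(status_code=422, detail="negative amounts found")

@app.get("/reports/sales/{period}")
def sales_report(period: str):
    df = load_sales(period)
    validate(df)
    summary = df.groupby("region", as_index=False)["amount"].sum()
    return {"period": period, "rows": summary.to_dict(orient="records")}
```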
Developed a comprehensive data quality framework that validates incoming data against business rules and schema definitions. The framework provides automated alerts for data anomalies and maintains detailed audit logs for compliance purposes.
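The shape of such a framework, in a deliberately reduced form: rules are declared as data, every run is written to an audit log, and failing rules trigger an alert hook. The specific rules, columns, and alert target here are assumptions for illustration.

```python
# Illustrative rule-based data quality check (rules, columns, and alerting are assumptions)
import logging
from dataclasses import dataclass
from typing import Callable
import pandas as pd

audit_log = logging.getLogger("dq.audit")
logging.basicConfig(level=logging.INFO)

@dataclass
class Rule:
    name: str
    check: Callable[[pd.DataFrame], pd.Series]  # returns a boolean mask of failing rows

RULES = [
    Rule("amount_non_negative", lambda df: df["amount"] < 0),
    Rule("customer_id_present", lambda df: df["customer_id"].isna()),
]

def run_checks(df: pd.DataFrame) -> dict:
    results = {}
    for rule in RULES:
        failures = int(rule.check(df).sum())
        results[rule.name] = failures
        # Audit trail for compliance; a real framework would also persist this record
        audit_log.info("rule=%s failing_rows=%d total_rows=%d", rule.name, failures, len(df))
        if failures:
            # Hook for an automated alert (e-mail, webhook, etc.)
            audit_log.warning("data quality alert: %s", rule.name)
    return results
```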
Created a high-performance API gateway using FastAPI with asynchronous endpoints to handle data requests from multiple client applications. Implemented dependency injection patterns and comprehensive error handling for improved maintainability and reliability.
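A condensed sketch of that gateway pattern, assuming a hypothetical upstream data service reached over HTTP via httpx; the route, base URL, and error mapping are illustrative.

```python
# Illustrative async gateway with dependency injection (upstream URL and routes are hypothetical)
from fastapi import FastAPI, Depends, HTTPException
import httpx

app = FastAPI(title="data-gateway")

async def get_client():
    # Injected per-request HTTP client, closed automatically after the response
    async with httpx.AsyncClient(base_url="http://internal-data-service") as client:
        yield client

@app.get("/datasets/{name}")
async def get_dataset(name: str, client: httpx.AsyncClient = Depends(get_client)):
    try:
        resp = await client.get(f"/datasets/{name}", timeout=10.0)
        resp.raise_for_status()
    except httpx.HTTPStatusError as exc:
        # Map upstream failures to a clean gateway error
        raise HTTPException(status_code=exc.response.status_code, detail="upstream error") from exc
    except httpx.RequestError as exc:
        raise HTTPException(status_code=502, detail="data service unreachable") from exc
    return resp.json()
```

Keeping the client as a yielded dependency is what makes the handlers easy to test and keeps connection handling out of the business logic.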