Data Pipeline Development:
Design and implement ETL (Extract, Transform, Load) processes to move data from various sources into data warehouses or data lakes.
Ensure data pipelines are scalable, efficient, and fault-tolerant.
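The extract/transform/load flow above can be sketched in a few lines. This is a minimal illustration only, assuming a CSV text source and an in-memory list standing in for the warehouse; a production pipeline would read from real sources and write to a database or data lake.

```python
# Minimal ETL sketch: CSV string -> normalized dicts -> in-memory "warehouse".
# The source string and warehouse list are hypothetical stand-ins.
import csv
import io

def extract(source: str) -> list[dict]:
    """Extract: read raw rows from a CSV string (stand-in for a file or API)."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(rows: list[dict]) -> list[dict]:
    """Transform: cast fields to proper types so downstream queries are reliable."""
    return [{"id": int(r["id"]), "amount": float(r["amount"])} for r in rows]

def load(rows: list[dict], warehouse: list[dict]) -> None:
    """Load: append transformed rows to the target store."""
    warehouse.extend(rows)

warehouse: list[dict] = []
source = "id,amount\n1,10.5\n2,20.0\n"
load(transform(extract(source)), warehouse)
```

Keeping each stage a separate function is what makes a pipeline like this testable and easier to make fault-tolerant, since a failed stage can be retried in isolation.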
Database Management:
Develop and maintain relational and NoSQL databases.
Optimize database performance and ensure data integrity.
Data Integration:
Integrate data from multiple sources, including APIs, flat files, and third-party services.
Ensure seamless data flow across systems.
Data Quality Assurance:
Implement data validation and cleansing processes to maintain high data quality.
Monitor data pipelines for errors and resolve issues promptly.
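The validation-and-cleansing duty above can be illustrated with a small sketch. The rules here (non-empty id, non-negative numeric amount) are hypothetical examples, not a specific framework's API.

```python
# Minimal data-quality sketch: validate each record against simple rules,
# then split the batch into clean rows and rejects for later review.
def validate(record: dict) -> list[str]:
    """Return a list of rule violations for one record (empty list = valid)."""
    errors = []
    if not record.get("id"):
        errors.append("missing id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        errors.append("invalid amount")
    return errors

def cleanse(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Partition records into (valid, rejected) based on the rules above."""
    valid, rejected = [], []
    for r in records:
        (valid if not validate(r) else rejected).append(r)
    return valid, rejected

valid, rejected = cleanse([
    {"id": 1, "amount": 9.99},
    {"id": None, "amount": -5},
])
```

Routing rejects to a side table rather than dropping them silently is what makes pipeline errors visible and promptly resolvable.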
Collaboration:
Work closely with data scientists, analysts, and business stakeholders to understand data requirements.
Provide support for data exploration and analysis.
Documentation:
Maintain comprehensive documentation of data pipelines, schemas, and processes.
Ensure compliance with data governance and security policies.
Health & Wellness:
Comprehensive health insurance (medical, dental, vision).
Mental health support and wellness programs.
Financial:
Provident Fund (PF), Gratuity, and other retirement benefits.
Performance-based bonuses and stock options.
Work-Life Balance:
Paid time off (PTO), sick leave, and national holidays.
Flexible working hours and remote work options.
Professional Development:
Opportunities for certifications (e.g., AWS Certified Big Data – Specialty, Google Professional Data Engineer).
Access to training programs, workshops, and conferences.