Job Summary:
We are seeking a talented Data Engineer with strong experience in Python, Hive, Hadoop, Scala, and Spark. The ideal candidate will be responsible for developing, maintaining, and optimizing data pipelines and architectures. You will work closely with cross-functional teams to ensure the efficient processing and analysis of large datasets.
Key Responsibilities:
- Design, develop, and maintain robust data pipelines and workflows.
- Utilize Python, Hive, Hadoop, Scala, and Spark for data processing and analysis.
- Implement data integration solutions and optimize performance.
- Collaborate with data scientists and analysts to understand data needs and deliver solutions.
- Ensure data quality, consistency, and reliability across systems.
- Monitor, troubleshoot, and resolve issues in data pipelines.
- Work on data warehousing solutions and ETL processes.
- Stay up to date with the latest industry trends and technologies.
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 5-7 years of experience in data engineering or related roles.
- Strong proficiency in Python, Hive, Hadoop, Scala, and Spark.
- Experience with data integration and ETL processes.
- Knowledge of data warehousing concepts and solutions.
- Familiarity with cloud platforms for data storage and processing (e.g., AWS, Azure).
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills.
Preferred Skills:
- Experience with other big data technologies such as Kafka, Flink, or Presto.
- Knowledge of machine learning frameworks and libraries.
- Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Understanding of DevOps practices and CI/CD pipelines.
Education: Bachelor's degree
Key Skills: Hadoop, Hive, Python, Scala, Spark
Industry: IT-Software / Software Services