Responsibilities:
• Collect, store, process, and analyse large data sets.
• Select the optimal solutions for these purposes, then implement, maintain, and monitor them.
• Integrate those solutions with the architecture used across the company, and help build out core services that power Machine Learning and analytics systems.
Role Requirements:
• Lead and work closely with all teams (including virtual teams based in non-UK locations), creating a strong culture of transparency and collaboration.
• Ability to process and rationalise structured, message, and semi-structured/unstructured data, and to integrate multiple large data sources and databases into one system.
• Proficient understanding of distributed computing principles and of the fundamental design principles behind scalable applications.
• Strong knowledge of the Big Data ecosystem; experience with Hortonworks/Cloudera platforms.
• Self-starter with strong technical skills who enjoys the challenge of delivering change within tight deadlines.
• Knowledge of one or more of the following domains (including market data vendors):
o Party/Client
o Trade
o Settlements
o Payments
o Instrument and pricing
o Market and/or Credit Risk
• Practical expertise in developing applications and using querying tools on top of Hive and Spark (PySpark); a brief sketch of this kind of work appears after this list.
• Experience with or knowledge of Scala preferred.
• Experience in Python, particularly the Anaconda environment and Python-based ML model deployment.
• Experience with Continuous Integration/Continuous Deployment (Jenkins/Hudson/Ansible).
• Experience working in teams using Agile methods (Scrum) and Confluence/JIRA.
• Good communication skills (written and spoken), ability to engage with different stakeholders and to synthesise different opinions and priorities.
• Knowledge of at least one Python web framework (preferably Flask, Tornado, and/or Twisted); the model-serving sketch after this list uses Flask.
• Basic understanding of front-end technologies, such as JavaScript, HTML5, and CSS3, would be a plus.
• Good understanding of global markets, market microstructure, and macroeconomics.
• Knowledge of the Elastic Stack (ELK); see the Elasticsearch query sketch after this list.
• Experience using:
o HDFS.
o Git/GitLab as a version control system.
• Good knowledge of the SDLC and formal Agile processes, a bias towards TDD, and a willingness to test products as part of the delivery cycle.
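For illustration, a minimal PySpark sketch of the kind of Hive querying described above. This is a sketch under assumptions: the table name (trades) and its columns are hypothetical placeholders, and it presumes a Spark installation with a configured Hive metastore.

    from pyspark.sql import SparkSession

    # Start a Spark session with Hive support (assumes a configured
    # Hive metastore; the table and columns below are hypothetical).
    spark = (SparkSession.builder
             .appName("hive-query-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Aggregate notional by counterparty over a Hive-managed table.
    daily_totals = spark.sql("""
        SELECT counterparty, SUM(notional) AS total_notional
        FROM trades
        WHERE trade_date = '2024-01-02'
        GROUP BY counterparty
    """)
    daily_totals.show()
    spark.stop()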
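Likewise, a minimal sketch of Python-based ML model deployment behind Flask, as referenced in the framework bullet. The model file (model.pkl) and the JSON payload shape are illustrative assumptions, not specifics of this role.

    import pickle
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Hypothetical pre-trained model with a scikit-learn-style
    # predict() method, loaded once at startup.
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expects JSON like {"features": [[0.1, 0.2, 0.3]]}
        features = request.get_json()["features"]
        return jsonify({"prediction": model.predict(features).tolist()})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)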
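Finally, a short sketch of querying the Elastic Stack from Python, assuming the 8.x elasticsearch client; the index name (app-logs) and the level field are illustrative assumptions.

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # Match ERROR-level entries in a hypothetical log index.
    response = es.search(index="app-logs",
                         query={"match": {"level": "ERROR"}})
    for hit in response["hits"]["hits"]:
        print(hit["_source"])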
Education: Bachelor's (any discipline)
Key skills: Big Data, Data Engineering, PySpark, Python, Hive, Spark, Hortonworks/Cloudera, Git/GitLab
Industry: IT Software / Software Services