Machine Learning Engineer

Barclays
Introduction:
As a Python Developer/Data Engineer, you will be responsible for creating and optimizing data and data pipelines for statistical, rule-based and machine learning models, and for writing Python and PySpark code for ETL operations on source data, data wrangling, data profiling and aggregation to prepare model-ready data. You will support our designers, data analysts and data scientists on different project initiatives and will ensure optimal project delivery. You must be self-directed and comfortable supporting development requirements, which include the creation of ML frameworks and components and the creation of data pipelines in Python, PySpark, Hive and MLlib for end-to-end (E2E) product lifecycle roadmaps. You will also act as a Data Enabler for ML use cases as well as for any insights generation tasks.
You will need to gain a good understanding of the data in all the source systems in order to support any downstream use case, MI or reporting requirements.
What will you be doing?
- Working closely with business analysts, designers and data scientists to understand requirements, and rebuilding models using MEF (Model Execution Framework) in Python (Pandas, NumPy and other data structures).
- Writing modular, reusable Python code that follows best coding practices.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Deploy sophisticated analytics programs and machine learning models using a DevOps pipeline.
- Supporting E2E project implementation, from requirements analysis through to go-live.
- Collaborating with other internal and supplier teams as required to deliver the E2E solution.
- Provide estimates & impact assessments for small / medium complexity engineering work.
- Ensuring quality and integrity throughout the various phases of the software development lifecycle.
- Review test strategies to guarantee the quality of the delivered system, and assist in the diagnosis of system problems encountered during testing.
- Assist in the resolution of live incidents and system problems as and when required.
- Work closely with the business analysts, data scientists and testing specialists within the team to ensure they understand your solution.
What we’re looking for:
- Advanced working knowledge of Python (data processing and data pipelines using Pandas, NumPy and other data structures)
- Good knowledge of packaging Python code (preparing conda builds)
- Proficient in RDBMS, SQL and writing efficient queries
- Strong knowledge of the full Hadoop/Spark ecosystem, ETL and data modelling
- Experience with big data tools: Hadoop, Spark
- Experience in performing root cause analysis on internal and external processes to answer specific business questions and identify opportunities for improvement.
- Graduate, preferably with a specialisation in IT or Mathematics, with relevant work experience or the relevant skills listed above.
Skills that will help you in the role:
- Exposure to Banking and Finance domains, as well as technology within banking/finance.
- Awareness of management information tools and systems.
- Familiarity with the governance and compliance policies, standards and procedures required to fulfil the role, e.g. LCP onboarding.
- Able to solve problems with data and to understand relevance to the wider objective.
- Experience with big data tools such as Kafka.
Where will you be working?
- Pune
To apply for this job please visit search.jobs.barclays.
