EPAM job posting that mentions Spark and PySpark
About the job
Description
We are looking for a Bioinformatics Engineer to join our team in Serbia
Requirements
Familiarity with Molecular Biology data and techniques associated with:
NGS data, processing and analysis
GWAS
Genome assembly
Genome annotation
SNPs and Haplotypes
Strong proficiency in SQL (PostgreSQL, MySQL, Oracle)
Solid experience with AWS (S3, EC2, Fargate, ECS, Lambda). AWS certification is a plus
Big Data and large-scale data processing:
Apache Spark (PySpark, Sparklyr)
Solid working experience with Python:
Solid experience with OOP
Poetry, Pydantic
Good knowledge of ORM (SQLAlchemy)
Knowledge of any REST framework (Flask, FastAPI, Django) is a plus
Unit-testing, TDD
Packaging Python packages and publishing to PyPI/Conda
AsyncIO, multiprocessing / multithreading
Solid experience with Git, knowledge of different branching strategies (Gitflow, GitHub flow, Trunk Based Development)
Deep understanding of CI/CD process, hands-on experience with Jenkins
Hands-on experience with Docker, Docker Compose
Confident Linux user, bash scripting
Nice to have
RStudio ecosystem:
RStudio Pro
RStudio Package Manager
RStudio Connect
CRAN
Python ecosystem:
pip / Conda / Mamba package management
JupyterLab
Linux / HPC:
Confident operating high-performance cluster systems via schedulers such as SLURM or Univa Grid Engine (UGE)/Sun Grid Engine (SGE)
Deep learning frameworks:
PyTorch
TensorFlow
We offer
Dynamic, entrepreneurial, high-speed, high-growth corporate environment
Diverse multicultural, multi-functional, and multilingual work environment
Opportunities for personal and career growth in a progressive industry
Global scope, international projects
Widespread training and development opportunities
Unlimited access to LinkedIn learning solutions
Competitive salary and various benefits
Sport and social teams support, recreation area, advanced CSR programs