Middle+ Data Scientist (Python)

Remote
Full-time

Brief description of the vacancy

WaveAccess is looking for a Data Scientist to strengthen the team. The project deals with real-world pharmaceutical data and involves characterizing patient treatment patterns and outcomes using electronic health records and health insurance claims data.

About the company

Company WaveAccess

WaveAccess is an international results-driven company that provides high-quality custom software development services for hundreds of emerging and established companies globally. By supporting customers with talented software engineers and vast experience in advanced technologies, WaveAccess builds innovative software solutions while minimizing development risks and costs.

Throughout its 22-year history, the company’s highly skilled specialists have implemented over 500 successful projects for market leaders, ambitious startups, and government institutions.

Responsibilities

  • Engineer features from sequence data for downstream tasks 
  • Build sequence models using techniques like RNNs, CNNs, attention, transformers  
  • Fine-tune transformer models and implement custom model architectures as needed 
  • Build NLP pipelines for tasks like text classification, named entity recognition, question answering 
  • Monitor model performance and implement improvements to increase accuracy 
  • Incorporate multi-modal data such as text, numerics, images, and genomics 
  • Work with large language models, including prompt tuning

Requirements

  • Commercial development experience
  • English - B2
  • Experience using Microsoft Office, such as PowerPoint, Excel, and Word 
  • Ability to communicate effectively, both verbally and in writing
  • Ability to work on multiple projects while executing on key deliverables within required timeframes 
  • Ability to write and debug scripts in Python, as well as familiarity with common Python packages (e.g., numpy, pandas, scikit-learn, tensorflow/keras/pytorch)
  • Deep knowledge of neural networks, in particular architectures for working with sequences (RNN, LSTM, Transformers, CNN, attention)
  • Understanding of linear algebra
  • Experience with AWS (EC2, S3, SageMaker)

Optional

  • Knowledge of general Machine Learning approaches
  • Knowledge of mathematical statistics.
  • Understanding of CI/CD

Platforms

Windows or Linux with knowledge of shell commands

Programming Languages

Python

Methods/Frameworks

Required

  • NumPy
  • Keras / TF / PyTorch
  • transformers
  • pandas
  • matplotlib / seaborn / yellowbrick

Desirable

  • Flask / Django
  • scikit-learn
  • imblearn
  • SciPy

Tools (infrastructure)

  • Git (GitHub, Bitbucket, GitLab)
  • Jira (or alternative - YouTrack / Trello / Redmine / GitHub Project)
  • Confluence (or alternative - GitHub Wiki / BookStack / Document360)
  • IDE (PyCharm / VSCode or an alternative)
  • Jupyter (Jupyter Notebook, Jupyter Lab, Jupyter Hub)
  • Basic knowledge of SQL

Optional

  • flake8 or other code linter
  • Docker
  • pre-commit
  • Snowflake

Working conditions

  • High, fully official salary, indexed annually
  • Official employment in accordance with labor law, with sick leave and vacation paid at 100%
  • Voluntary medical insurance (VMI) with dental coverage
  • Work using flexible development methodology (Agile/Scrum)
  • Flexible start of the working day
  • Weekly seminars, participation in conferences and meetups, and payment for certification exams
