— Python3 + classic ML/CV stack (numpy, pandas, sklearn, opencv)
— Deep Learning frameworks: PyTorch/TensorFlow (PyTorch is preferred)
— understanding of the theoretical concepts of Machine Learning and Deep Learning
— 3+ years of DS/ML experience (at least 2 years of CV experience)
— practical experience in at least two of the following problems: image classification, object detection, segmentation, OCR, metric learning
— experience in deploying ML models to production
— good English for reading SOTA articles and communicating with foreign colleagues
— goal-oriented mindset
— Practical experience with ML models in production (2+ years);
— Python 3.x, software engineering skills (able to produce well-structured production-level projects, not only notebook scripts);
-- Experience with ML frameworks and libraries: numpy, pandas, nltk, PyTorch, scikit-learn, spaCy, gensim, LSTM, transformers (BERT, GPT);
— Good knowledge of ML theory and practice — pros & cons of different model types, validation, metrics, etc.;
-- Understanding of the theoretical concepts of NLP (language modeling, text classification, sequence classification, question answering, etc.);
Languages: Python, JS
Libraries/Frameworks: Keras, TensorFlow, PyTorch, sklearn, numpy, pandas, spaCy, PySpark, Vue.js, etc.
Distributed Computing: Spark
Cloud Platform: AWS (AWS ML)
CI/CD: Docker, Kubernetes
DB: PostgreSQL, MongoDB, etc.
completed or incomplete higher education in IT or a physics and mathematics field;
knowledge of the basics of statistics and probability theory;
knowledge of the main clustering and classification methods;
knowledge of Python and basic programming experience with it;
knowledge of the data libraries numpy and pandas;
familiarity with the TensorFlow or PyTorch frameworks;
experience working with databases and knowledge of SQL;
English at a level sufficient to read and understand technical documentation.
• 3+ years of relevant industry experience and an advanced degree in machine learning, computer science, statistics, biostatistics, mathematics, or a related quantitative field
• Proven track record of shipping machine learning-powered algorithm products at B2C-like scale as well as working with cross-functional teams in an agile-like environment
• You have a firm grasp of machine learning fundamentals and the ability to design intuitive, working ML solutions in response to complex business problems
• You are strong in the “design and prototype” part of the ML development pipeline, beginning with pulling datasets from SQL and ending with serializing ML models and assisting engineers to productionize model retraining and model serving systems
• When it comes to communicating, you have no problem explaining ML/algorithm designs clearly to cross-functional team members, especially engineers and product managers
• You are well versed in SQL data warehouses such as Redshift and Snowflake, have worked with current ML tools such as TensorFlow, PyTorch, and Python, and feel comfortable with recommender systems or natural language processing
• To take it one step further, you are effective at translating and blending traditionally distinct ML concepts such as recommender systems, NLP, regression, and classification into a common framework such as TensorFlow
3+ years of experience developing software that uses machine learning, in Python or another programming language
Experience working with ML models at all stages
Expert knowledge of algorithms
Knowledge of PyTorch and TensorFlow
Understanding of neural network architectures for various domains (Text-to-Speech, Behavior Prediction, NLP, and others)
English at B1 level or higher
Strong analytical and data interpretation skills
Work experience with Python (Pandas, NumPy, bs4, Selenium, Sklearn, SciPy, Keras)
Experience with scraping and data cleaning
Understanding of and experience in developing ML algorithms (regression, classification, neural networks)
Strong background in statistics, probability theory, and linear algebra
Good understanding of model scoring, results interpretation, and further usage