Maria Obedkova | NLP Engineer
NLP Engineer with expertise in ML and DL and industry experience applying NLP methods to real-world problems.
Important Links
Experience
Senior NLP Engineer at TrustYou (Sep 2022 - present | remote and Munich, Germany)
Opinion Mining - ABSA - transformers - pytorch - API - CFG - CI/CD - MLOps - ML system design
- Improving a Transformer-based solution for ABSA and scaling it up
- Exploring approaches for multilingual ABSA
- Defining system designs for ML and DL applications
NLP Engineer at TrustYou (Apr 2020 - Aug 2022 | remote and Munich, Germany)
Opinion Mining - ABSA - transformers - pytorch - API - CFG - CI/CD
- Developed a Transformer-based solution for ABSA and put it in production
- Researched different approaches to ABSA and performed various Data Analysis tasks
- Supported and maintained the legacy system that performs Sentiment Analysis with the help of CFG
ASR - AWE - DL - Python - tensorflow - Java - Kaldi
- Researched different Deep Learning approaches to pronunciation generation for Speech Recognition
- Investigated Acoustic Word Embeddings and improved their quality for a pronunciation discrimination task
- Developed a completely new data-driven method of pronunciation generation for ASR purposes
NLP Engineer at Fact Read (Oct 2016 - Dec 2018 | remote)
Fact Extraction - Coreference Resolution - WSD - ML - Python - scikit-learn
- Implemented the morphological-syntactical pipeline and improved its quality
- Developed the anaphora resolution module for news texts using a Machine Learning approach
- Supervised linguists and coordinated the interaction of linguists and programmers in a team
Computational Linguist at ABBYY (Jan 2017 - Sep 2017 | Moscow, Russia)
NER - Unit Testing - Ontologies - Knowledge Graphs
- Developed various solutions for Fact Extraction using ABBYY tools
- Implemented unit testing for the Fact Extraction system
Computational Linguistics Intern at ABBYY Labs (Jan 2016 - Jun 2016 | Moscow, Russia)
Syntax - Tokenisation & Splitting - Python - Perl - regexp
Education
MA Computational Linguistics (2017 – 2019)
Erasmus Mundus Joint Degree with Erasmus Mundus Scholarship Award:
- Charles University - CUNI (Prague, Czech Republic)
- University of the Basque Country - UPV/EHU (San Sebastian, Spain)
BA Fundamental and Applied Linguistics (2013 – 2017)
Projects
Open-Source Projects (2021-2022)
- HuggingFace Robust Speech Challenge: Developed a Speech Recognition model for Russian and open-sourced it on the HuggingFace hub. Took part in Feb 2022
- NL-Augmenter: Added a sentiment filter to the NL-Augmenter project that helps with augmentation for sentiment analysis tasks. Took part in Sep 2021
Hackathons
University Projects (2015-2019)
- Pronunciation Generation for ASR: Developed a pronunciation generation method for ASR based on AWEs. MA thesis in 2019
- Unsupervised Machine Translation: Investigated Cross-Lingual Word Embeddings for Unsupervised Neural Machine Translation for a rus-eng pair. Course project in 2018
- Russian Sketches: Developed the collocation extraction method on the basis of syntactical structure for Russian. BA thesis in 2017
- Amharic Corpus: Developed the Amharic corpus with Part-of-Speech Tagging using a Machine Learning approach. Presented at the “ConCort” conference on Digital Humanities in 2016
- Automatic Authorship Attribution: Investigated different approaches to authorship determination and their statistical evaluation. Presented at the “Digital Humanities” conference in Tartu in 2015
Skills
- Python Programming
- Data Analysis
- Machine Learning
- Deep Learning
- CI/CD
- Databases
linux - bash - git - PostgreSQL - AWS - Grafana - pandas - numpy - matplotlib - scikit-learn - tensorflow - pytorch - pyspark
At different times, I worked with:
- Sentiment Analysis
- Statistical and Neural Machine Translation
- Automated Speech Recognition
- Part-of-Speech Tagging and Parsing
- Word Sense Disambiguation
- Coreference Resolution
- Text Classification and Clustering
- Information Retrieval
- Named Entity Recognition
- Fact Extraction
using transformer-based, statistical and rule-based approaches.
nltk - spacy - flair - huggingface libraries - nlpaug - checklist - word2vec - wav2vec2 - Kaldi - SparkNLP
Professional Activities
Communication Skills
- Russian (native)
- English (C1)
- German (B2)
- French (A1)
- Spanish (A1)
Other Interests
Neuroscience - Language theory - Travelling - Sketching - Roller skating - Yoga - Tennis - Writing - Language learning