Data Scientist
On-site · Potomac, Maryland, United States
Job Summary
Data Scientist role focused on NLP and data preparation to transform structured and unstructured data into actionable analytics. Responsibilities include conducting sophisticated NLP analyses, preprocessing and cleaning text data, designing and implementing advanced ETL and Oracle SQL data structures, authoring analytic publications and ad-hoc reports with visualizations, staying current with metadata tools, and providing subject-matter expertise in NLP to support initiatives. Requires experience with Python NLP libraries, deep learning frameworks, transformers, text classification, topic modeling, encoder-decoder models, and proficiency in SQL plus version control and GPU-accelerated computing. Collaboration may occur with a team or independently; work supports customers and senior leadership with data-driven insights.
Required Qualifications
- Demonstrated professional or academic experience performing NLP tasks, including selecting the best Python libraries for a given task, choosing appropriate pre-processing actions, performing analysis, and assessing model performance.
- Demonstrated professional or academic experience using Python NLP packages such as Spacy, Gensim, or NLTK to analyze or process collections of documents.
- Demonstrated professional or academic experience with deep learning frameworks such as PyTorch, Tensorflow, or Keras
- Demonstrated professional or academic experience with the HuggingFace Transformers library and hub.
- Demonstrated experience creating machine learning models that conduct text classification and topic modeling in Python using standard machine learning (Scikit-learn) or deep learning models.
- Demonstrated academic or professional experience using encoder-decoder and generative language models to perform NLP tasks.
- Demonstrated academic or professional experience communicating methodological choices and model results.
- Demonstrated professional or academic experience and proficiency with SQL to include using common table expressions, set operations, aggregated functions and nested subqueries.
- Demonstrated professional or academic experience with version control systems such as Github and Jenkins.
- Demonstrated experience leveraging GPUs for accelerated computing.
- Develop practical approaches for measuring performance.
- Assist in developing types of measure, the collection of data, analyzing the data, and presenting that data to senior leadership.
- Conduct advanced statistical analysis on personnel, intelligence, and performance metrics.
- Assist in selection or development of appropriate methodology to conduct research.
- Analyze information and provide research findings in a manner that is easily grasped by the customer and consumers.
Apply with one swipe on Sorce. We auto-fill applications and apply on your behalf — no cover letters, no 40-minute forms.
Hiring someone like this?
Get your role in front of qualified candidates on Sorce.