Remote Data Analyst - Machine Learning | WFH

Part Time Remote - US

Job Overview

We are seeking a driven and talented Machine Learning Data Analyst to join our AI & Threat Analytics team. The position is fully remote, with a hybrid schedule available for candidates based in the El Dorado Hills, CA, or Chicago, IL areas. In this role, you will contribute to the advancement of machine learning models by managing, optimizing, and analyzing the datasets behind our AI and threat analytics initiatives.

Key Responsibilities

• Oversee the end-to-end process of data collection, cleaning, and preprocessing for datasets utilized in machine learning applications.
• Employ web analysis tools to extract and structure data from DOM environments, thereby facilitating model training and validation.
• Collaborate with machine learning engineers to design and execute feature engineering experiments, generating training datasets aligned with model specifications.
• Develop and enhance synthetic datasets using large language models (LLMs) to ensure a diverse and well-rounded set of training data.
• Utilize dimensionality reduction techniques (e.g., t-SNE, PCA, UMAP) to examine data patterns and enhance dataset quality.
• Streamline data workflows through automation for efficient data processing, manipulation, and transformation.
• Document data workflows, methodologies, and processes, ensuring clarity, reproducibility, and scalability.
• Implement robust data validation and quality control systems to maintain consistency and integrity within datasets.

Required Skills

• A minimum of 2 years of experience as a Data Analyst, preferably in a cybersecurity or machine learning context.
• Proficiency in Python for data analysis and automation, with practical expertise in libraries such as Pandas and NumPy.
• Strong background with web analysis tools (e.g., Selenium, BeautifulSoup) and a deep understanding of HTML and DOM structures.
• Knowledge of natural language processing (NLP) techniques such as tokenization, stop word removal, and lemmatization for text data preparation.
• Experience in generating synthetic datasets and employing LLMs to enhance machine learning data.
• Capacity to collaborate efficiently with machine learning engineers and other technical teams.
• Excellent problem-solving abilities and a meticulous approach to data quality and governance.
• Familiarity with cloud platforms (AWS, GCP, Azure) for data storage and processing.

Qualifications

• Bachelor's degree in Data Science, Statistics, Computer Science, or a related field, or equivalent professional experience.
• Due to the nature of this role's involvement with GovCloud, all applicants must be classified as a U.S. Person.

Career Growth Opportunities

We are committed to fostering a culture of professional development and continuous learning, providing our employees with the resources needed to advance their careers while working on impactful projects in the AI and cybersecurity sectors.

Company Culture and Values

We pride ourselves on our culture of innovation and collaboration. Our team is dedicated to creating an inclusive workplace where diversity is celebrated, and all employees feel valued and supported.

Compensation and Benefits

• Comprehensive medical, dental, and vision insurance (including domestic partnership coverage).
• Employer-sponsored life insurance and additional options for employee/spouse/child supplemental life insurance.
• Voluntary short/long-term disability insurance.
• 401(k) plan with Roth and traditional investment options.
• Generous paid time off (PTO) including bereavement and jury duty leave.
• Competitive annual bonuses.

Employment Type: Full-Time