Posted: Jan 21, 2026
About the position

Responsibilities
• Translate business, modeling and ethical requirements in Health AI into executable data collection projects.
• Design human-in-the-loop evaluation tasks to measure the performance and usability of models in the medical domain.
• Develop the materials necessary to execute successful data collection efforts such as guidelines, annotation interfaces, quality assurance workflows.
• Support the sourcing and/or creation of high-quality language datasets and language artifacts for feature and language expansion.
• Analyze structured and unstructured data to provide actionable recommendations to improve data quality or model performance.
• Iterate and innovate on data collection methodologies to improve data turnaround time and reliability.
• Incorporate LLMs, prompt engineering, and ML techniques to automate repetitive annotation and data creation workflows.

Requirements
• 2+ years of data scientist experience.
• 3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical/mathematical software (e.g. R, SAS, Matlab, etc.) experience.
• PhD in a language and human behavior related field with a strong quantitative component (e.g., Cognitive Linguistics, Sociolinguistics, Human-Computer Interaction); or, a Master's degree with 3+ years of field experience.
• Experience in data mining and cleaning for NLP machine learning model pipelines.
• Experience in language data collection for quantitative analysis, including guidelines, workflow design.
• Experience in research and experimental design involving human participants.
• Experience in statistical measures for data quality assessment and research hypotheses testing.
• Practical knowledge of data labeling tools and techniques (e.g., Amazon SageMaker Ground Truth, brat, ELAN).
• Excellent knowledge of semantics, pragmatics, conversation analysis, and/or discourse analysis.
• Ability to explain complex concepts and solutions in easy-to-understand terms.
Nice-to-haves
• Experience with LLMs and prompt engineering techniques and other programmatic approaches to annotation, including weak supervision and active learning.
• Practical knowledge of version control systems (e.g. Git).
• Experience with spoken data collection, speech analysis, speech transcription (from scratch or ASR-assisted).
• Experience working with clinical or medical data, such as medical transcriptions, clinical notes, or electronic health records (EHRs).
• Knowledge of healthcare terminology and medical ontologies (e.g., SNOMED CT, ICD, RxNorm).

Benefits
• Medical, financial, and/or other benefits including equity and sign-on payments.
• Flexible working culture to support work-life balance.
• Mentorship and career growth resources.
• Employee-led affinity groups fostering a culture of inclusion.

Apply to this job