Python Developers - US

Remote Full-time
Work Location: Remote, within the US
Engagement Model: Freelancer/Independent Contractor
Start Date: ASAP

DataForce by TransPerfect is looking for skilled Python Developers to architect, build, and own the data pipelines that power large language model (LLM) development. Your primary mission will be to build scalable, automated systems that transform massive raw datasets into clean, model-ready formats. While your focus will be on data engineering, your expertise will also be valuable in collaborating on model training runs and experiments.

You are a strong fit for this role if you are a Python expert who thrives on solving large-scale data challenges and enjoys working at the intersection of data engineering and machine learning.

Role Responsibilities
• Design, develop, and own robust, scalable, and automated ETL/ELT pipelines in Python to ingest and process terabyte-scale text datasets.
• Implement rigorous data cleaning, deduplication, filtering, and normalization strategies, and define and enforce data quality standards to ensure high integrity for model training.
• Efficiently structure and format diverse datasets (e.g., JSON, Parquet) for consumption by LLM training frameworks.
• Work closely with AI researchers and ML engineers to understand data requirements, define metrics, and support the model training lifecycle.
• Continuously optimize data processing workflows for performance, cost efficiency, and reliability.
• Occasionally assist with launching, monitoring, and debugging data-related issues during model training runs.

Role Requirements
• 5–10 years of professional experience in Python development, data engineering, data processing, or backend software engineering.
• Expert-level proficiency in Python and its data ecosystem (e.g., Pandas, NumPy, Dask, Polars).
• Proven experience building and maintaining large-scale data pipelines.
• Deep understanding of data structures, data modeling, and software engineering best practices (Git, CI/CD, testing).
• Experience handling and parsing diverse data formats (JSON, CSV, XML, Parquet) at scale.
• Excellent problem-solving skills and meticulous attention to detail.
• Strong communication and collaboration skills, with experience working in a team environment.

Preferred Role Requirements
• Hands-on experience with the data preprocessing pipeline for an LLM (e.g., LLaMA, BERT, GPT-family).
• Experience with big data frameworks like Apache Spark or Ray.
• Experience with Hugging Face libraries (Transformers, Datasets, Tokenizers).
• Familiarity with ML frameworks like PyTorch or TensorFlow.
• Proficiency with cloud platforms (AWS, GCP, Azure) and their data/storage services.

DataForce by TransPerfect is part of the TransPerfect family of companies, the world’s largest provider of language and technology solutions for global business, with offices in more than 100 cities worldwide. We offer high-quality data for Human-Machine Interaction to some of the most prestigious technology companies in the world. Our department focuses on gathering, enriching, and processing data for Machine Learning in different AI domains. To learn more about DataForce, please visit us at

TransPerfect provides equal employment opportunity to all individuals regardless of their race, color, creed, religion, gender, age, sexual orientation, national origin, disability, veteran status, or any other characteristic protected by state, federal, or local law.

For more information on the TransPerfect Family of Companies, please visit our website at www.transperfect.com.

About the Company: DataForce by TransPerfect
