Job Responsibilities:
Build a data-centric closed data loop for full-modality large models, covering pre-training data, instruction data, and data fed back from the live network; consolidate data science and engineering practice; and strengthen Xiaoyi's product competitiveness and user experience. The work includes:
1. Data technology pre-research: track how large-model data technology is evolving and solve the medium- and long-term data problems of large models, including but not limited to data curriculum learning, multimodal data alignment, data for large-model Agent capabilities, and the theory and practice of instruction data;
2. Data cleaning and quality evaluation: define detailed, comprehensive, and actionable standards for full-modality data acquisition, cleaning, and quality; build a data cleaning engineering platform; classify training data; establish a per-category data quality and optimization system; and resolve issues such as data topic distribution and content compliance;
3. Data labeling: own the full-modality data labeling platform, including interaction design and interface development; iterate on annotation tools, annotation standards, and quality inspection standards; and continuously improve the quality and efficiency of annotating difficult data; ...

AI LLM Data Scientist
