
Senior Software Engineer (Go) – AI Code Evaluation | Remote
Role summary
Seeking a Senior Software Engineer with Go expertise and 5+ years of experience for an hourly contract role focused on AI code evaluation. The position involves assessing LLM-generated responses to coding queries for accuracy, reasoning, and clarity. Responsibilities include executing code, validating outputs, annotating model responses, and ensuring adherence to evaluation standards. Candidates must have strong Go programming skills, experience with LLMs, and the ability to solve complex coding problems. Experience with RLHF, model evaluation, or open-source contributions is highly valued. This is a remote, full-time or part-time contract position.
Position:
Software Engineering, Data Science, and Systems Design Experts, Go (5+ years of experience)
Type:
Hourly contract
Compensation:
$60-$100 per hour
Location:
Remote
Commitment:
Full-time or Part-time Contract Work
Role Responsibilities
- Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness
- Conduct fact-checking using trusted public sources and authoritative references
- Conduct accuracy testing by executing code and validating outputs using appropriate tools
- Annotate model responses by identifying strengths, areas for improvement, and factual or conceptual inaccuracies
- Assess code quality, readability, algorithmic soundness, and explanation quality
- Ensure model responses align with expected conversational behavior and system guidelines
- Apply consistent evaluation standards by following taxonomies, benchmarks, and detailed evaluation guidelines
Requirements
- BS, MS, or PhD in Computer Science or a closely related field
- Real-world experience in software engineering or related technical roles
- Expertise in the Go programming language
- Ability to solve HackerRank or LeetCode Medium- and Hard-level problems independently
- Experience contributing to open-source projects, including merged pull requests
- Experience using LLMs while coding and understanding their strengths and failure modes
- Strong attention to detail, with the ability to evaluate complex technical reasoning and identify bugs or logical flaws
- Prior experience with RLHF, model evaluation, or data annotation work preferred
- Track record in competitive programming preferred
- Experience reviewing code in production environments preferred
- Familiarity with multiple programming paradigms or ecosystems preferred
- Experience explaining complex technical concepts to non-expert audiences preferred
Application Process (Takes 20 Minutes)
- Upload resume
- Interview (15 min)
- Submit form