
Backend Engineer (Go) – AI Systems & Code Quality | Remote
Role summary
Seeking a Backend Engineer with Go expertise and 5+ years of experience for a remote, hourly contract role. This position focuses on evaluating AI systems, specifically LLM-generated code and responses, for accuracy, reasoning, and quality. Responsibilities include fact-checking, executing code for validation, annotating model outputs, and assessing code quality. The ideal candidate will have a strong software engineering background, experience with LLMs, and the ability to solve complex algorithmic problems. Experience with model evaluation, RLHF, or open-source contributions is highly valued.
Position: Software Engineering, Data Science, and Systems Design Experts, Go (5+ YOE)
Type: Hourly contract
Compensation: $60-$100 per hour
Location: Remote
Commitment: Full-time or part-time contract work
Role Responsibilities
- Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness
- Conduct fact-checking using trusted public sources and authoritative references
- Conduct accuracy testing by executing code and validating outputs using appropriate tools
- Annotate model responses by identifying strengths, areas for improvement, and factual or conceptual inaccuracies
- Assess code quality, readability, algorithmic soundness, and explanation quality
- Ensure model responses align with expected conversational behavior and system guidelines
- Apply consistent evaluation standards by following taxonomies, benchmarks, and detailed evaluation guidelines
Requirements
- BS, MS, or PhD in Computer Science or a closely related field
- Real-world experience in software engineering or related technical roles
- Expertise in the Go programming language
- Ability to solve HackerRank or LeetCode Medium- and Hard-level problems independently
- Experience contributing to open-source projects, including merged pull requests
- Experience using LLMs while coding and understanding their strengths and failure modes
- Strong attention to detail and ability to evaluate complex technical reasoning and identify bugs or logical flaws
- Prior experience with RLHF, model evaluation, or data annotation work preferred
- Track record in competitive programming preferred
- Experience reviewing code in production environments preferred
- Familiarity with multiple programming paradigms or ecosystems preferred
- Experience explaining complex technical concepts to non-expert audiences preferred
Application Process (Takes 20 Minutes)
- Upload resume
- Interview (15 min)
- Submit form