
QA Engineer - Code Review Expert
Role summary
This is a short-term, remote contract role for a QA Engineer with expertise in code review, partnering with a top AI research organization. The primary responsibility is to evaluate user-AI coding conversations by reviewing transcripts, analyzing the AI's logic and actions, and scoring them using a detailed rubric. Ideal candidates are senior or staff engineers with deep code review experience, QA engineers with strong verification skills, or technical writers. Proficiency in Python is helpful, and familiarity with various programming languages, Git, testing frameworks, and debugging tools is a plus. This task-based engagement offers flexibility and weekly payments.
We are partnering with a top AI research organization to evaluate and improve how coding assistants reason, act, and communicate during development workflows. We’re seeking technically sharp experts (especially those with experience in code review, testing, or documentation) to assess full transcripts of user-AI coding conversations. This short-term engagement helps shape the future of developer-assisting AI systems.
Key Responsibilities
- Review long-form transcripts between users and AI coding assistants
- Analyze the AI’s logic, execution, and stated actions in detail
- Score each transcript using a 10-point rubric across multiple criteria
- Optionally write brief justifications citing examples from the dialogue
- Detect mismatches between claims and actions (e.g., saying “I’ll run tests” but not doing so)
Ideal Qualifications
Top choices:
- Senior or Staff Engineers with deep code review experience and execution insight
- QA Engineers with strong verification and consistency-checking habits
- Technical Writers or Documentation Specialists skilled at comparing instructions vs. implementation
Also a Strong Fit
- Backend or Full-Stack Developers comfortable with function calls, APIs, and test workflows
- DevOps or SRE professionals familiar with tool orchestration and system behavior analysis
Languages and Tools
- Proficiency in Python is helpful (most transcripts are Python-based)
- Familiarity with other languages like JavaScript, TypeScript, Java, C++, Go, Ruby, Rust, or Bash is a plus
- Comfort with Git workflows, testing frameworks, and debugging tools is valuable
More About the Opportunity
- Each transcript batch must be completed within 5 hours of starting it; there is no limit on the number of tasks you can take on
- Flexible, task-based engagement with potential for recurring batches
Application Process
- Submit your resume to begin
- If selected, you’ll receive rubric documentation and access to the evaluation platform
- Most applicants hear back within a few business days
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.
Contract and Payment Terms
- You will be engaged as an independent contractor.
- This is a fully remote role that can be completed on your own schedule.
- Projects can be extended, shortened, or concluded early depending on needs and performance.
- Your work will not involve access to confidential or proprietary information from any employer, client, or institution.
- Payments are made weekly via Stripe or Wise, based on services rendered.