
Open-Source AI Workload Software Engineer
Role summary
AMD is seeking an influential Software Engineer to join a core team focused on optimizing AI workloads on AMD GPUs. This role involves improving performance on open-source LLM repositories like vLLM and SGLang, co-optimizing AI workloads by analyzing and mitigating kernel-level bottlenecks, and integrating AMD's software stacks (ROCm, ATen) into popular frameworks such as PyTorch, JAX, and Triton. The ideal candidate will have strong motivation, a "Just-Do-It" mindset, and hands-on experience with state-of-the-art LLM inference and training. Preferred experience includes deep AI infrastructure knowledge, strong kernel optimization skills using DSLs and HIP/CUDA, and familiarity with modern GPU architecture. Demonstrated open-source contributions are also highly valued.
Overview:
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
Responsibilities:
THE ROLE:
AMD is looking for an influential engineer to optimize AI workloads running on AMD GPUs. You will be a founding member of a core team of exceptionally talented industry specialists to define the future of AI computing solutions.
THE PERSON:
Ideal candidates should possess a "Just-Do-It" mindset and strong motivation. They must be driven to understand the status quo and explore better solutions using first-principles thinking. Hands-on experience with state-of-the-art LLM inference and training at both the framework and kernel levels is highly preferred.
KEY RESPONSIBILITIES:
Improve AMD GPU performance on open-source repositories for LLM workloads such as vLLM and SGLang.
Co-optimize AI workloads on current AMD GPUs by analyzing the bottlenecks and mitigating them at the kernel level.
Integrate AMD software stacks (ROCm, ATen) into open-source frameworks such as PyTorch, JAX, and Triton.
Build strong technical relationships with peers and partners, and report learnings and gaps to GPU software and hardware engineers.
PREFERRED EXPERIENCE:
Deep AI infrastructure experience with open-source frameworks (e.g., SGLang, vLLM, JAX, XLA, PyTorch, Triton).
Strong kernel optimization skills using DSLs and HIP (or CUDA), plus PTX/SASS equivalents.
Hands-on knowledge of modern GPU architecture.
Demonstrated open-source contributions on GitHub.
Motivational leadership and excellent interpersonal skills.
ACADEMIC CREDENTIALS
Bachelor's, Master's, or Ph.D. in Computer Engineering, Computer Science, Electrical Engineering, or a related technical field
#LI-EV1
#LI-HYBRID
Qualifications:
*Benefits offered are described:* AMD benefits at a glance.
*AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants’ needs under the respective laws throughout all stages of the recruitment and selection process.*
*AMD may use Artificial Intelligence to help screen, assess or select applicants for this position. AMD’s “Responsible AI Policy” is available* *here.*
*This posting is for an existing vacancy.*
Sample AMD interview questions
1. Develop a service for managing distributed locking. (system design, medium)
2. Merge a New Interval: Merge a new interval into a list of non-overlapping intervals. Input: intervals = [[1,2],[3,5],[6,7],[8,10],[12,16]], newInterval = [4,8] Output: [[1,2],[3,10],[12,16]] Explanation: The new interval overlaps with [3,5], [6,7], and [8,10], merging them all into the unified block [3,10]. (coding, medium)
3. Aggressive Cows: Maximize the minimum distance between aggressive cows in stalls. Input: stalls = [0,4,3,7,10,9], cows = 3 Output: 4 Explanation: Placing the cows at positions 0, 4, and 10 yields a maximum possible minimum distance of 4 between any two cows. (coding, medium)
4. Longest Consecutive Sequence: Determine the length of the longest consecutive elements sequence. Input: nums = [0,3,7,2,5,8,4,6,0,1] Output: 9 Explanation: The longest consecutive sequence is 0 through 8 (length 9), utilizing a hash set to check connectivity in linear time. (coding, medium)
5. Reverse Nodes in k-Group: Reverse nodes in k-group in a linked list. Input: head = [1,2,3,4,5], k = 3 Output: [3,2,1,4,5] Explanation: The first 3 elements are reversed, while the remaining 2 are left untouched since they don't form a complete group. (coding, medium)
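For practice, the four coding questions above can be sketched in Python as follows. This is one possible approach per question, not an official AMD solution. The function names, the ListNode class, and the assumption that the interval list arrives sorted and non-overlapping are illustrative choices that do not come from the posting.

```python
from typing import List, Optional


def insert_interval(intervals: List[List[int]], new: List[int]) -> List[List[int]]:
    """Merge `new` into `intervals` (assumed sorted and non-overlapping) in one pass."""
    merged, start, end = [], new[0], new[1]
    placed = False
    for lo, hi in intervals:
        if hi < start:                  # entirely before the new interval
            merged.append([lo, hi])
        elif lo > end:                  # entirely after: emit the merged block once
            if not placed:
                merged.append([start, end])
                placed = True
            merged.append([lo, hi])
        else:                           # overlap: grow the merged block
            start, end = min(start, lo), max(end, hi)
    if not placed:
        merged.append([start, end])
    return merged


def max_min_distance(stalls: List[int], cows: int) -> int:
    """Binary-search the largest gap for which a greedy scan still places all cows."""
    stalls = sorted(stalls)

    def fits(gap: int) -> bool:
        placed, last = 1, stalls[0]     # first cow always goes in the first stall
        for pos in stalls[1:]:
            if pos - last >= gap:
                placed, last = placed + 1, pos
        return placed >= cows

    lo, hi = 0, stalls[-1] - stalls[0]
    while lo < hi:
        mid = (lo + hi + 1) // 2        # round up so the loop always shrinks
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


def longest_consecutive(nums: List[int]) -> int:
    """Hash-set scan: only count upward from values that begin a run."""
    seen, best = set(nums), 0
    for n in seen:
        if n - 1 not in seen:
            length = 1
            while n + length in seen:
                length += 1
            best = max(best, length)
    return best


class ListNode:
    """Minimal singly linked list node (hypothetical helper for this sketch)."""

    def __init__(self, val: int, next: "Optional[ListNode]" = None):
        self.val = val
        self.next = next


def reverse_k_group(head: Optional[ListNode], k: int) -> Optional[ListNode]:
    """Reverse each complete group of k nodes; leave a short tail untouched."""
    dummy = ListNode(0, head)
    group_prev = dummy
    while True:
        kth = group_prev
        for _ in range(k):              # walk to the k-th node of the next group
            kth = kth.next
            if kth is None:             # fewer than k nodes remain
                return dummy.next
        group_next = kth.next
        tail = group_prev.next          # first node of the group becomes its tail
        prev, cur = group_next, tail
        while cur is not group_next:    # reverse the pointers inside the group
            nxt = cur.next
            cur.next = prev
            prev, cur = cur, nxt
        group_prev.next = kth           # reconnect the reversed group
        group_prev = tail
```

Each function can be checked against the sample input and output given in its question. The interval merge and the linked-list reversal run in linear time, the hash-set scan is linear on average, and the stall placement costs a sort plus a logarithmic number of linear scans.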