GPU Software Engineer

San Jose, California, United States | Onsite | Contract | Posted 1 month ago | Visa sponsorship available


Role: GPU Software Engineer

Location: San Jose, CA – Onsite

Duration: 6+ Months Contract-to-Hire

Overview:
We’re looking for a strong GPU Software Engineer to join a high‑impact engineering team working on next‑generation AI, GPU, and semiconductor technologies. This role focuses on GPU kernel development, memory architecture, and integration with modern inference systems such as vLLM and SGLang. You’ll work onsite in San Jose, collaborating closely with a team of engineers building high‑performance GPU‑accelerated systems.

Essential Skills:

- GPU Architecture Expertise: In-depth understanding of GPU architectures, memory models (including HBM and P2P), and thread hierarchies.
- GPU Software Proficiency: End-to-end knowledge of AMD/Nvidia GPU software stacks.
- Programming Experience: Familiarity with programming models such as CUDA, ROCm/HIP, OpenCL, or MPI.
- Inference Server Experience: Hands-on experience with inference servers like vLLM and SGLang.
- Kernel Development: Experience developing CUDA/ROCm kernels.
- Performance Optimization: Proven ability to optimize performance.

Technical Proficiency:

- Programming Languages: Expert-level proficiency in C++ and Python.
- Deep Learning Frameworks: Experience with frameworks like PyTorch or TensorFlow is a plus.

Preferred Skills:

- Networking Technologies: Familiarity with RDMA/RoCE, InfiniBand, or Infinity Fabric is desirable.

Ready to apply?

You'll be redirected to Triune Infomatics Inc's application page.
