Triune Infomatics Inc
IT Consulting, Staffing, Systems Integration

GPU Software Engineer

San Jose, California, United States · Onsite · Contract · Posted 10 days ago · Visa sponsorship available

Role: GPU Software Engineer

Location: San Jose, CA – Onsite

Duration: 6+ Months Contract-to-Hire

Overview:
We’re looking for a strong GPU Software Engineer to join a high‑impact engineering team working on next‑generation AI, GPU, and semiconductor technologies. This role focuses on GPU kernel development, memory architecture, and integration with modern inference systems such as vLLM and SGLang. You’ll work onsite in San Jose, collaborating closely with a team of engineers building high‑performance GPU‑accelerated systems.

Essential Skills:

- GPU Architecture Expertise: In-depth understanding of GPU architectures, memory models (including HBM and P2P), and thread hierarchies.
- GPU Software Proficiency: End-to-end knowledge of AMD/Nvidia GPU software stacks.
- Programming Experience: Familiarity with programming models such as CUDA, ROCm/HIP, OpenCL, or MPI.
- Inference Server Experience: Hands-on experience with inference servers like vLLM and SGLang.
- Kernel Development: Experience developing CUDA/ROCm kernels.
- Performance Optimization: Proven ability to optimize performance.

Technical Proficiency:

- Programming Languages: Expert-level proficiency in C++ and Python.
- Deep Learning Frameworks: Experience with frameworks like PyTorch or TensorFlow is a plus.

Preferred Skills:

- Networking Technologies: Familiarity with RDMA/RoCE, InfiniBand, or Infinity Fabric is desirable.
