GPU Compute Performance Engineer

Waltham, Massachusetts, United States
Software and Services

Summary

Weekly Hours: 40
Role Number: 200554934
The Apple Silicon GPU Driver Performance team is responsible for ensuring great GPU performance across our entire product line. This team is involved in all phases of the product development cycle: working with HW teams on our GPU feature and architectural roadmaps, delivering and analyzing the performance of the latest GPU workloads on emerging platforms, developing our state-of-the-art performance analysis capabilities on silicon, and helping internal and external partners achieve the best performance possible on Apple Silicon GPUs. Members of this team possess deep technical expertise in our GPU architecture and programming models. We use this expertise to develop workflows and tools for deep performance analysis, which we leverage to develop and optimize GPU graphics and compute workloads specifically for Apple GPUs. The team works on benchmarks, games, pro apps, ML, GPU compute, and image processing use cases, optimizing the workloads at both the algorithm and shader level to achieve speed-of-light performance.

Description

The team is seeking extraordinary GPU and machine learning engineers who are passionate about providing robust compute solutions for accelerating machine learning workloads at both the system and GPU programming level on Apple Silicon. The ideal candidate will have a passion for squeezing the best performance possible out of our GPUs and be able to explore the low-level architectural details of the HW to achieve this. They will work closely with our GPU hardware architecture and design teams to help develop our GPU roadmap and to ensure Apple is building the right HW and SW features to make the best, and fastest, GPU products.

This role’s responsibilities will include:

  • Working with internal partners to analyze and improve GPU and system performance of large-scale ML deployments on single- and multi-SoC systems.
  • Working with internal and external partners to optimize their GPU-based ML algorithm implementations, GPU compute applications, algorithms, and shaders to achieve the best possible performance on Apple platforms.
  • Working with internal hardware teams to define a hardware roadmap that continues to deliver best-in-class GPU performance as well as performance analysis capabilities, particularly in the areas of emerging GPU-accelerated ML training and inference, GPGPU use cases, and workflows.
  • Developing tools and frameworks to support internal and external developers with performance analysis on Apple Silicon GPUs.

Minimum Qualifications

  • Experience or interest in emerging GPGPU use cases in the areas of ML and compute
  • Experience or interest in optimizing compute workloads for GPU performance is a strong plus
  • GPU programming with Metal, DirectX, Vulkan, CUDA, Direct Compute, OpenGL, or OpenCL
  • Excellent software design and problem solving skills
  • Excellent system debugging skills
  • Excellent written and oral communication skills, including the ability to explain analytical outcomes and technical roadblocks clearly and concisely to multiple audiences

Preferred Qualifications

  • Experience with ML frameworks such as PyTorch, TensorFlow, JAX, or similar is highly desired
  • Experience in GPU compute kernel optimization for ML training and inference operations is highly desired

Additional Requirements

  • Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.