AI Inference Engineer Intern – Model Pruning
Responsibilities
- Model pruning: Prune models to speed up inference, then retrain them to maintain accuracy.
Requirements
- MS student in CS or related fields.
- Proficiency in Python.
- Experience with model pruning and training in PyTorch.
- Experience with quantization and vision model accuracy metrics.
Benefits
At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space.
The hourly rate for this temporary internship position is $45.00/hour to $60.00/hour. The actual rate offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience and education, technical skills and competencies, and work location.
Quadric interns receive hands-on experience working alongside industry experts in AI and semiconductor technology, with access to mentorship and meaningful project ownership from day one.
To view the job application, please visit apply.workable.com.



Available Is Not In Control: Balancing Output, Quality, and Risk in High-Volume Fabs