RAS High Availability System Architect

by KenMS on 05-13-2022 at 4:35 pm

  • Full Time
  • San Jose, CA
  • Applications have closed

Website Cadence

The Cadence hardware emulator is a complex embedded system used by many chip and system design companies to validate their multi-billion-gate designs before, during, and after the chip is released to fabrication. Because the emulation platform is used by many engineers for time-sensitive validation work, it must be available 24x7 to support development.

Cadence is the leader in hardware emulation and prototyping technology. The System Engineering (SE) team is looking for a RAS High Availability System Architect to help define next-generation products and enhance existing platforms.

Key responsibilities

  • Own and define the system availability budget and its breakdown.
  • Collaborate with other architects to develop the next-generation platform, using various HW/SW/information redundancy techniques to achieve the desired availability goal.
  • Model and analyze system faults, recovery times, and their impact on system availability.
  • Define and develop any necessary runtime availability monitoring tools.
  • Develop and drive integration tests to confirm the expected availability goal.
  • Transfer design, tools, and test knowledge to manufacturing and field installation teams.
  • Review, replicate, and respond to customer issues. Analyze logs from customer runs. Debug and isolate system-level issues down to individual subsystems.

Requirements

  • Must have an in-depth understanding of large-scale system availability and reliability design.
  • Must have strong, hands-on experience with various HW/SW development and analysis tools. A background in information coding analysis is a plus.
  • Must have architecture and design experience with global clocking/synchronization, Ethernet, memory, multi-processor, optical network PMA/PCS, PCIe, AC/DC power distribution, and medium- to large-scale resilience software development.
  • Experience defining system reliability and availability (RAS) design specifications and test plans.
  • Experience in system fault and recovery analysis and in debugging issues in large-scale systems.
  • Bachelor’s or Master’s degree in EE/CompEng/Reliability Engineering with at least 5 years of industry experience related to large-scale system design.