Search results

  1. Survey paper on Deep Learning on GPUs

    The rise of deep learning (DL) has been fueled by improvements in accelerators, and the GPU remains the most widely used accelerator for DL applications. We present a survey of architecture- and system-level techniques for optimizing DL applications on GPUs. We review 75+ techniques...
  2. Survey paper on Micron's Automata Processor

    Problems from a wide variety of application domains can be modeled as nondeterministic finite automata (NFAs), and hence, efficient execution of NFAs can improve the performance of several key applications. Since traditional architectures, such as CPUs and GPUs, are not inherently suited for...
  3. Survey paper on Intel's Xeon Phi

    Intel’s Xeon Phi combines the parallel processing power of a many-core accelerator with the programming ease of CPUs. We survey ~100 works that study the architecture of Phi and use it as an accelerator for a broad range of applications. We discuss the strengths and limitations of Phi. We...
  4. Survey on Neural Network on NVIDIA's Jetson Platform

    Designing hardware accelerators for neural network (NN) applications involves walking a tightrope amidst the constraints of low power, high accuracy and high throughput. NVIDIA's Jetson is a promising platform for embedded machine learning that seeks to achieve a balance between the above...
  5. Survey on mobile web browsing

    Mobile web browsing (MWB) can well be termed the confluence of two major revolutions: the mobile (smartphone) revolution and the internet revolution. Mobile web traffic has now surpassed desktop web traffic and has become the primary means for service providers to reach out to the billions of...
  6. Survey of Spintronic Architectures for Processing-in-Memory and Neural Networks

    Spintronic memories such as STT-RAM (spin transfer torque RAM), SOT-RAM (spin orbit torque RAM) and DWM (domain wall memory) facilitate efficient implementation of PIM (processing-in-memory) approach and NN (neural network) accelerators and offer several advantages over conventional memories...
  7. Survey of Data-Encoding Techniques for Reducing Data-movement Energy

    Data movement consumes two orders of magnitude more energy than a floating-point operation, and hence, data movement is becoming the primary bottleneck in scaling the performance of modern processors within a fixed power budget. Accelerators for deep neural networks have huge memory...
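One classic data-encoding technique in this space is bus-invert coding: if transmitting the next word would flip more than half the bus lines, send its complement instead and assert a one-bit "invert" flag, bounding switching activity (and hence dynamic energy) per transfer. The sketch below is a minimal Python illustration under assumed parameters (8-bit bus, bus initialized to zero); it is not taken from the survey itself, which covers many more schemes.

```python
def bus_invert(words, width=8):
    """Bus-invert coding sketch: for each word, count how many bus lines
    would toggle relative to the previous bus state; if more than half
    would toggle, transmit the complement and set the invert flag.
    Returns a list of (value_on_bus, invert_flag) pairs."""
    mask = (1 << width) - 1
    prev = 0  # assume the bus starts at all zeros
    encoded = []
    for w in words:
        flips = bin((w ^ prev) & mask).count("1")
        if flips > width // 2:
            out, inv = (~w) & mask, 1  # complement: fewer lines toggle
        else:
            out, inv = w & mask, 0
        encoded.append((out, inv))
        prev = out  # next word is compared against what was actually sent
    return encoded
```

For example, sending 0x00 then 0xFF would normally toggle all 8 lines; with bus-invert, 0xFF is sent as 0x00 with the invert flag set, so no data line toggles at all.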
  8. Survey on FPGA-based Accelerators for CNNs

    CNNs (convolutional neural networks) have recently been applied successfully to a wide range of cognitive challenges. Given the high computational demands of CNNs, custom hardware accelerators are vital for boosting their performance. The high energy efficiency, computing capabilities and...
  9. Survey on DRAM reliability techniques

    Aggressive process scaling and increasing demands for performance/cost efficiency have exacerbated the incidence and impact of errors in DRAM systems. Due to this, improving DRAM reliability has received significant attention in recent years. Our paper surveys techniques for improving...
  10. Survey of ReRAM (memristor) based Designs for Processing-in-memory and Neural Network

    As data movement operations and power-budget become key bottlenecks in processor design, interest in approaches such as processing-in-memory (PIM), machine learning, and especially neural network (NN)-based accelerators has grown significantly. Resistive RAM (ReRAM or memristor) can work as...
  11. Survey paper on dynamic branch predictors

    The branch predictor (BP) is an essential component of modern processors, since high BP accuracy can improve performance and reduce energy. However, reducing the latency and storage overhead of the BP while maintaining high accuracy presents significant challenges. We present a survey of dynamic branch...
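The baseline dynamic scheme most surveys start from is the 2-bit saturating-counter predictor, sketched below in Python. The table size, initial counter state, and PC-indexing are illustrative assumptions, not details from the paper.

```python
class TwoBitPredictor:
    """2-bit saturating-counter branch predictor sketch. A table of
    counters is indexed by the low bits of the branch PC. States 0-1
    predict not-taken, states 2-3 predict taken; each outcome moves the
    counter one step, so from a strong state it takes two consecutive
    mispredictions to flip the prediction."""
    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.table = [2] * (1 << index_bits)  # assume init: weakly taken

    def predict(self, pc):
        return self.table[pc & self.mask] >= 2  # True means "taken"

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0
```

The hysteresis is the point: a loop-closing branch that is taken hundreds of times and not-taken once stays predicted taken, which a 1-bit scheme gets wrong twice per loop.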
  12. Survey paper on security techniques for GPUs

    The graphics processing unit (GPU), although a powerful performance booster, also has many security vulnerabilities. Due to these, the GPU can act as a safe haven for stealthy malware and the weakest ‘link’ in the security ‘chain’. We present a survey of techniques for analyzing and improving GPU...
  13. Survey on Techniques for Improving Security of Non-volatile Memories

    Due to their high density and near-zero leakage power consumption, non-volatile memories (NVMs) are promising candidates for designing future memory systems. However, compared to conventional memories, NVMs also face more-severe security threats, e.g., the limited write endurance of NVMs makes...
  14. Survey on SLC/MLC/TLC Hybrid SSDs

    Our survey, accepted in the Concurrency and Computation journal (2018), covers techniques for managing SSDs designed with SLC/MLC/TLC Flash memory.
  15. An Open-source tool for modeling 2D/3D, SLC/MLC memories

    We have just released version 2 of DESTINY, which can model: (2D/3D) SRAM and eDRAM; (2D/3D, SLC/MLC) STT-RAM, ReRAM and PCM; and (2D, SLC/MLC) SOT-RAM, Flash and DWM (SLC/MLC = single/multi-level cell, DWM = domain wall memory, SOT-RAM = spin orbit torque RAM, STT-RAM = spin transfer torque...
  16. Survey paper on soft-error reliability techniques for PCM and STT-RAM

    We survey architectural techniques for improving the soft-error reliability of PCM (phase change memory) and STT-RAM (spin transfer torque RAM). We focus on soft-errors, such as resistance drift and write disturbance, in PCM and read disturbance and write failures in STT-RAM.
  17. Survey papers on managing TLB and designing memories using racetrack memory

    TLB (translation lookaside buffer) caches virtual to physical address translation information and is used in systems ranging from embedded devices to high-end servers. Since TLB is accessed very frequently and a TLB miss is extremely costly, prudent management of TLB is important. Domain wall...
  18. Survey papers on Cache compression, prefetching and power management

    With an increasing number of on-chip cores, the size of the last-level cache (LLC) is on the rise: e.g., Oracle's 20nm SPARC M7 processor has a 64MB LLC, Intel's 22nm Xeon E5-2600 processor has a 45MB LLC, and IBM's 22nm POWER8 has a 96MB eDRAM LLC. In fact, 70% of the transistors in the Intel Core i3...
  19. Survey Papers on Techniques for Register File in CPU and GPU

    Processor registers hold the architectural state of a program. We present survey papers on techniques for managing the register file (RF) in GPUs and CPUs. We review techniques for improving the performance, energy efficiency and reliability of the RF, and techniques for designing the RF with novel memory...
  20. A Survey On Cache Bypassing Techniques for CPUs, GPUs and CPU-GPU systems

    Abstract: With increasing core counts, the cache demand of modern processors has also increased. However, due to strict area/power budgets and the presence of workloads with poor data locality, blindly scaling cache capacity is both infeasible and ineffective. Cache bypassing is a promising technique to...
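To make the idea concrete, the toy Python sketch below adds one possible bypass heuristic to a direct-mapped cache: a block is inserted only after it has missed a threshold number of times, so use-once streaming blocks bypass the cache instead of evicting data that shows reuse. The set count, threshold, and counter table are illustrative assumptions, not the scheme of any particular paper in the survey.

```python
class BypassingCache:
    """Toy direct-mapped cache with reuse-based bypassing: track misses
    per block address, and only allocate a cache line once a block has
    missed `threshold` times (i.e., has demonstrated reuse)."""
    def __init__(self, n_sets=4, threshold=2):
        self.n_sets = n_sets
        self.threshold = threshold
        self.lines = [None] * n_sets  # one block address per set
        self.miss_count = {}          # block address -> misses observed

    def access(self, addr):
        s = addr % self.n_sets
        if self.lines[s] == addr:
            return "hit"
        self.miss_count[addr] = self.miss_count.get(addr, 0) + 1
        if self.miss_count[addr] >= self.threshold:
            self.lines[s] = addr      # block shows reuse: insert it
            return "miss-insert"
        return "miss-bypass"          # first touch: don't pollute the set
```

With this policy, a block touched only once never displaces the resident line in its set, which is exactly the pollution problem bypassing targets.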