Rashed Z. Bhatti


Senior Computer Engineer: Big Data Hardware Acceleration
Almaden Research Center, San Jose, CA, USA



Dr. Rashed Bhatti is a Senior Computer Software/Hardware Engineer at IBM Research. He began collaborating with the IBM Thomas J. Watson Research Center in 2005, while he was still in graduate school pursuing his PhD. He specializes in High Performance Computing (HPC) architecture and system design, VLSI/ASIC design and verification, and hardware/software integration. His deep understanding of computer architectures and semiconductor devices, from the most basic transistor level to the highest levels of hardware design and the software stack, gives him an extraordinarily rare combination of scientific skills.

In 2006, Dr. Bhatti co-architected the on-chip network of MONARCH (Morphable Networked micro-ARCHitecture), a first-generation polymorphic computing processor chip, while working at the Information Sciences Institute (ISI) in collaboration with Raytheon. After completing his PhD in 2007, Dr. Bhatti taught master’s-level classes in VLSI/ASIC design and verification for about three years as adjunct faculty at the University of Southern California.

Working with the IBM Thomas J. Watson Research Center, Dr. Bhatti remained involved in a series of IBM’s massively parallel supercomputing projects. On these projects he co-architected the design; worked on the logic and physical design (PD) implementations of the core CPU chips; contributed to the development of various SerDes semiconductor devices and their controllers; contributed to the development and implementation of the software and hardware layers of the supercomputers; and, just as important in practice as his research and development work, took part in the testing and bring-up of very large supercomputer systems.

In 2012, Dr. Bhatti joined IBM Research full time to work on hardware acceleration of real-time Big Data analytics problems in InfoSphere Streams. His most recent research focuses on scalable, platform-independent implementations of high-performance algorithms using OpenCL, targeted at heterogeneous computing platforms composed of a variety of CPUs (including AVX vector units), Intel Xeon Phi coprocessors, GPUs, and FPGAs.