Wei Zhang

contact information

Research Staff Member
Thomas J. Watson Research Center, Yorktown Heights, NY USA
+1-914-945-2747



2018

AdaComp: Generalized Residual Gradient Compression for Data-Parallel Distributed Training
Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan
AAAI Conference on Artificial Intelligence (AAAI-18), 2018


2017

Decentralized Distributed Deep Learning
Wei Zhang, Xiangru Lian, Ce Zhang, Ji Liu
Workshop on AI Systems at Symposium on Operating Systems Principles (AISys at SOSP'17), 2017

Accelerator design for deep learning training
Ankur Agrawal, Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Jinwook Oh, Sunil Shukla, Viji Srinivasan, Swagath Venkataramani, Wei Zhang
Design Automation Conference (DAC'17 invited paper), 2017

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, Ji Liu
Neural Information Processing Systems (NIPS'2017) *Oral Paper (40 out of 3240)*

Nexus: Bringing Efficient and Scalable Training to Deep Learning Frameworks
Yandong Wang, Li Zhang, Yufei Ren, Wei Zhang
The 25th IEEE International Symposium on the Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2017) *Best Paper Nominee*

GaDei: On Scale-up Training As A Service For Deep Learning
Wei Zhang, Minfei Feng, Yunhui Zheng, Yufei Ren, Yandong Wang, Ji Liu, Peng Liu, Bing Xiang, Li Zhang, Bowen Zhou, Fei Wang
The IEEE International Conference on Data Mining (ICDM'17), 2017

Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study
Suyog Gupta*, Wei Zhang*, Fei Wang (*Equal Contribution)
Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17) Invited Paper, 2017


2016

Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study
Wei Zhang*, Suyog Gupta*, Fei Wang (*Equal contribution)
The IEEE International Conference on Data Mining 2016 (This paper received the Best Paper Award Runner-up), IEEE

Staleness-Aware Async-SGD for Distributed Deep Learning
Wei Zhang, Suyog Gupta, Xiangru Lian, Ji Liu
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, pp. 2350--2356

AQuA: Adaptive Quality Analytics
Wei Zhang, Martin Hirzel, David Grove
Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems, pp. 169--180, ACM, 2016

META: Middleware for Events, Transactions, and Analytics
Matthew Arnold, David Grove, Benjamin Herta, Michael Hind, Martin Hirzel, Arun Iyengar, Louis Mandel, Vijay A. Saraswat, Avraham Shinnar, Jérôme Siméon, Mikio Takeuchi, Olivier Tardieu, Wei Zhang
IBM Journal of Research and Development 60(2-3), 2016

X10 and APGAS at Petascale
Olivier Tardieu, Benjamin Herta, David Cunningham, David Grove, Prabhanjan Kambadur, Vijay Saraswat, Avraham Shinnar, Mikio Takeuchi, Mandana Vaziri, Wei Zhang
ACM Trans. Parallel Comput. 2(4), 25:1--25:32, 2016


2015

Fixing, preventing, and recovering from concurrency bugs
DongDong Deng, GuoLiang Jin, Marc de Kruijf, Ang Li, Ben Liblit, Shan Lu, ShanXiang Qi, JingLei Ren, Karthikeyan Sankaralingam, LinHai Song, YongWei Wu, MingXing Zhang, Wei Zhang, WeiMin Zheng
Science China Information Sciences 58(5), 1--18, 2015


2014

GLB: Lifeline-based Global Load Balancing Library in X10
Wei Zhang, Olivier Tardieu, David Grove, Benjamin Herta, Tomio Kamada, Vijay Saraswat, Mikio Takeuchi
Proceedings of the First Workshop on Parallel Programming for Analytics Applications, pp. 31--40, ACM, 2014


2013

Efficient Concurrency-bug Detection Across Inputs
Dongdong Deng, Wei Zhang, Shan Lu
Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages and Applications, pp. 785--802, ACM

ConMem: Detecting Crash-Triggering Concurrency Bugs Through an Effect-Oriented Approach
Wei Zhang, Chong Sun, Junghee Lim, Shan Lu, Thomas Reps
ACM Trans. Softw. Eng. Methodol. 22(2), 10:1--10:33, ACM, 2013

ConAir: Featherweight Concurrency Bug Recovery via Single-threaded Idempotent Execution
Wei Zhang, Marc de Kruijf, Ang Li, Shan Lu, Karthikeyan Sankaralingam
Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 113--126, ACM, 2013


2012

Understanding the Interleaving-space Overlap Across Inputs and Software Versions
Dongdong Deng, Wei Zhang, Borui Wang, Peisen Zhao, Shan Lu
Proceedings of the 4th USENIX Conference on Hot Topics in Parallelism, pp. 17--17, USENIX Association, 2012

Automated Concurrency-bug Fixing
Guoliang Jin, Wei Zhang, Dongdong Deng, Ben Liblit, Shan Lu
Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, pp. 221--236, USENIX Association, 2012


2011

ConSeq: Detecting Concurrency Bugs Through Sequential Errors
Wei Zhang, Junghee Lim, Ramya Olichandran, Joel Scherpelz, Guoliang Jin, Shan Lu, Thomas Reps
Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 251--264, ACM, 2011

Automated Atomicity-violation Fixing
Guoliang Jin, Linhai Song, Wei Zhang, Shan Lu, Ben Liblit
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 389--400, ACM, 2011


2010

ConMem: Detecting Severe Concurrency Bugs Through an Effect-oriented Approach
Wei Zhang, Chong Sun, Shan Lu
Proceedings of the Fifteenth Edition of ASPLOS on Architectural Support for Programming Languages and Operating Systems, pp. 179--192, ACM, 2010