# Dan He

## contact information

Computational Genomics

Thomas J. Watson Research Center, Yorktown Heights, NY USA

+1 914 945 2315

## links

### Professional Associations

ACM | IEEE Member | International Society for Computational Biology

**2016**

Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction

He, Dan and Kuhn, David and Parida, Laxmi

*Bioinformatics* *32*(*12*), i37--i43, Oxford Univ Press, 2016

Mint: Mutual information based transductive feature selection for genetic trait prediction

He, Dan and Rish, Irina and Haws, David and Parida, Laxmi

*IEEE/ACM Transactions on Computational Biology and Bioinformatics* *13*(*3*), 578--583, IEEE, 2016

**2015**

Does encoding matter? A novel view on the quantitative genetic trait prediction problem

He, Dan and Parida, Laxmi

*Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on*, *pp. 123--126*

SAME: a sampling-based multi-locus epistasis algorithm for quantitative genetic trait prediction

He, Dan and Parida, Laxmi

*Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics*, *pp. 286--295*, 2015

Performance evaluation of different encoding strategies for quantitative genetic trait prediction

Ogundijo, Oyetunji E and He, Dan and Parida, Laxmi

*Computational Advances in Bio and Medical Sciences (ICCABS), 2015 IEEE 5th International Conference on*, *pp. 1--6*

Mined: An efficient mutual information based epistasis detection method to improve quantitative genetic trait prediction

He, Dan and Wang, Zhanyong and Parida, Laxmi

*International Symposium on Bioinformatics Research and Applications*, *pp. 108--124*, 2015

Variable-Selection Emerges on Top in Empirical Comparison of Whole-Genome Complex-Trait Prediction Methods

Haws, David C and Rish, Irina and Teyssedre, Simon and He, Dan and Lozano, Aurelie C and Kambadur, Prabhanjan and Karaman, Zivan and Parida, Laxmi

*PloS one* *10*(*10*), e0138903, Public Library of Science, 2015

Data-driven encoding for quantitative genetic trait prediction

He, Dan and Wang, Zhanyong and Parida, Laxmi

*BMC bioinformatics* *16*(*Suppl 1*), S10, BioMed Central Ltd, 2015

**2014**

IPED2: Inheritance path based pedigree reconstruction algorithm for complicated pedigrees

He, Dan and Wang, Zhanyong and Parida, Laxmi and Eskin, Eleazar

*Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics*, *pp. 202--210*, 2014

**2013**

IPEDX: An exact algorithm for pedigree reconstruction using genotype data

He, Dan and Eskin, Eleazar

*Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on*, *pp. 517--520*

Optimized retrieval algorithms for personalized content aggregation

He, Dan and Parker, Douglass S

*Information Reuse and Integration (IRI), 2013 IEEE 14th International Conference on*, *pp. 270--277*

Leveraging multi-SNP reads from sequencing data for haplotype inference

Yang, Wen-Yun and Hormozdiari, Farhad and Wang, Zhanyong and He, Dan and Pasaniuc, Bogdan and Eskin, Eleazar

*Bioinformatics*, btt386, Oxford Univ Press, 2013

Leveraging reads that span multiple single nucleotide polymorphisms for haplotype inference from sequencing data

Yang, Wen-Yun and Hormozdiari, Farhad and Wang, Zhanyong and He, Dan and Pasaniuc, Bogdan and Eskin, Eleazar

*Bioinformatics* *29*(*18*), 2245--2252, Oxford Univ Press, 2013

IBD-Groupon: an efficient method for detecting group-wise identity-by-descent regions simultaneously in multiple individuals based on pairwise IBD relationships

He, Dan

*Bioinformatics* *29*(*13*), i162--i170, Oxford Univ Press, 2013

IPED: inheritance path-based pedigree reconstruction algorithm using genotype data

He, Dan and Wang, Zhanyong and Han, Buhm and Parida, Laxmi and Eskin, Eleazar

*Journal of Computational Biology* *20*(*10*), 780--791, Mary Ann Liebert, Inc., 2013

**2012**

Hap-seqX: Expedite Algorithm for Haplotype Phasing with Imputation using Sequence Data

D. He, E. Eskin

*Gene*, Elsevier, 2012

Modeling semantic influence for biomedical research topics using MeSH hierarchy

D. He

*Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on*, *pp. 1--6*

Hap-seq: An Optimal Algorithm for Haplotype Phasing with Imputation Using Sequencing Data

D. He, B. Han, E. Eskin

*Research in Computational Molecular Biology*, *pp. 64--78*, 2012

**2011**

Mining research topic-related influence between academia and industry

D. He

*Machine Learning and Knowledge Discovery in Databases*, 17--31, Springer, 2011

CLAP: Collaborative pattern mining for distributed information systems

X. Zhu, B. Li, X. Wu, D. He, C. Zhang

*Decision Support Systems*, Elsevier, 2011

How Does Research Evolve? Pattern Mining for Research Meme Cycles

D. He, X. Zhu, D.S. Parker

*Data Mining (ICDM), 2011 IEEE 11th International Conference on*, *pp. 1068--1073*

Genotyping common and rare variation using overlapping pool sequencing

D. He, N. Zaitlen, B. Pasaniuc, E. Eskin, E. Halperin

*BMC Bioinformatics* *12*(*Suppl 6*), S2, BioMed Central Ltd, 2011

Mining approximate repeating patterns from sequence data with gap constraints

D. He, X. Zhu, X. Wu

*Computational Intelligence* *27*(*3*), 336--362, Wiley Online Library, 2011

Using HLA binding prediction algorithms for epitope mapping in HIV vaccine clinical trials

D. He, P. Kunwar, E. Eskin, H. Horton, P. Gilbert, T. Hertz

*Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine*, *pp. 594--601*, 2011

Efficient algorithms for tandem copy number variation reconstruction in repeat-rich regions

D. He, F. Hormozdiari, N. Furlotte, E. Eskin

*Bioinformatics* *27*(*11*), 1513--1520, Oxford Univ Press, 2011

Learning the funding momentum of research projects

D. He, D. Parker

*Advances in Knowledge Discovery and Data Mining*, 532--543, Springer, 2011

An optimal weighted aggregated association test for identification of rare variants involved in common diseases

J.H. Sul, B. Han, D. He, E. Eskin

*Genetics* *188*(*1*), 181--188, Genetics Soc America, 2011

Topical semantics of twitter links

M.J. Welch, U. Schonfeld, D. He, J. Cho

*Proceedings of the fourth ACM international conference on Web search and data mining*, *pp. 327--336*, 2011

**2010**

Rule synthesizing from multiple related databases

D. He, X. Wu, X. Zhu

*Advances in Knowledge Discovery and Data Mining*, 201--213, Springer, 2010

Effective algorithms for fusion gene detection

D. He, E. Eskin

*Algorithms in Bioinformatics*, 312--324, Springer, 2010

Detection and reconstruction of tandemly organized de novo copy number variations

D. He, N. Furlotte, E. Eskin

*BMC bioinformatics* *11*(*Suppl 11*), S12, BioMed Central Ltd, 2010

Optimal algorithms for haplotype assembly from whole-genome sequence data

D. He, A. Choi, K. Pipatsrisawat, A. Darwiche, E. Eskin

*Bioinformatics* *26*(*12*), i183--i190, Oxford Univ Press, 2010

Topic dynamics: an alternative model of bursts in streams of topics

D. He, D.S. Parker

*Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining*, *pp. 443--452*, 2010

**2009**

Approximate repeating pattern mining with gap requirements

D. He, X. Zhu, X. Wu

*Tools with Artificial Intelligence, 2009. ICTAI'09. 21st International Conference on*, *pp. 17--24*

Error detection and uncertainty modeling for imprecise data

D. He, X. Zhu, X. Wu

*Tools with Artificial Intelligence, 2009. ICTAI'09. 21st International Conference on*, *pp. 792--795*

**2008**

Cleansing noisy data streams

X. Zhu, P. Zhang, X. Wu, D. He, C. Zhang, Y. Shi

*Data Mining, 2008. ICDM'08. Eighth IEEE International Conference on*, *pp. 1139--1144*

**2007**

Iterative Refinement of Repeat Sequence Specification Using Constrained Pattern Matching

D. He, A.N. Arslan, Y. He, X. Wu

*Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on*, *pp. 1199--1203*

A novel greedy algorithm for the minimum common string partition problem

D. He

*Bioinformatics Research and Applications*, 441--452, Springer, 2007

SAIL-APPROX: An efficient on-line algorithm for approximate pattern matching with wildcards and length constraints

D. He, X. Wu, X. Zhu

*Bioinformatics and Biomedicine, 2007. BIBM 2007. IEEE International Conference on*, *pp. 151--158*

**2006**

Ontology-Based Feature Weighting for Biomedical Literature Classification

D. He, X. Wu

*Information Reuse and Integration, 2006 IEEE International Conference on*, *pp. 280--285*

Using suffix tree to discover complex repetitive patterns in DNA sequences

D. He

*Engineering in Medicine and Biology Society, 2006. EMBS'06. 28th Annual International Conference of the IEEE*, *pp. 3474--3477*

A fast algorithm for the Constrained Multiple Sequence Alignment problem

D. He, A.N. Arslan, A.C.H. Ling

*Acta Cybernetica* *17*(*4*), 701--717, Acta Cybernetica, 2006

**2005**

A parallel algorithm for the Constrained Multiple Sequence Alignment problem

D. He, A.N. Arslan

*Bioinformatics and Bioengineering, 2005. BIBE 2005. Fifth IEEE Symposium on*, *pp. 258--262*

A space-efficient algorithm for the constrained pairwise sequence alignment problem

D. He, A.N. Arslan

*Genome Informatics* *16*(*2*), 237--246, 2005

**Year Unknown**

Space-efficient Algorithms for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

FastPCMSA: An Improved Parallel Algorithm for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

A* Algorithms for the Constrained Multiple Sequence Alignment Problem

D. He, A.N. Arslan

