W. Scott Spangler

contact information

Principal Data Scientist - Watson Health, Discovery
Almaden Research Center, San Jose, CA, USA


Professional Associations:  ACM SIGKDD

More information:  My book: Mining the Talk



Recent Work

I am currently the Principal Data Scientist, Watson Health, Discovery Technology. Our group develops the Watson for Drug Discovery solution. In my role as Data Scientist I am leading the application of Watson technology to Accelerate Discovery. Some recent customer engagements include:

  • A project with Baylor College of Medicine to aid in the discovery of new potential cancer therapies centered around the P53 protein.
  • Work with Sanofi on using Watson technology for Drug Repurposing.
  • Work with Pfizer on discovering combination therapies for Immuno-Oncology.
  • Work with Barrow Neurological Institute to find new targets for ALS.


Over the past twenty years I have developed several applications for mining unstructured information, including eClassifier, Service Request Analyzer, and STAT (Simple Text Analysis Tool). This taxonomy editing technology was also included in the Lotus Discovery Server application in 2001. Following this I led the development of the Business Insights Workbench tool (BIW) for analyzing structured and unstructured information. BIW has been used on several external client engagements with Pfizer and Novartis, and internally has been used to help the IBM T&IP (Trademark and Intellectual Property) department better leverage our patent portfolio. I have also developed tools for, and been actively involved in, text analysis of internal IBM Jams (ValuesJam, WorldJam) and external client Jams (HabitatJam, NokiaJam). More recently I was a principal architect of the COBRA system for Corporate Brand Reputation Analysis, which was developed during a customer engagement with the MARS snackfood corporation. My most recent work is leading the SIIP (Strategic IP Insights Platform) for patent analytics and DeepQA for Life Sciences.

The best description of my work is contained in the book Mining the Talk, which describes our unique text mining methodology along with customer engagement experiences using the Unstructured Information Modeling approach. A second book, Accelerating Discovery: Mining Unstructured Information for Hypothesis Generation, describes my more recent approaches.



I was born in Florida, but grew up in Lynchburg, Virginia. I moved to Michigan while in high school, and, there, having nothing better to do, began to concentrate seriously on my studies. In school I found my two great loves were mathematics and literature.

I applied to MIT, and was astonished to be accepted. There I dabbled in Math, Physics, Literature, History, and Economics, without really being sure what I wanted to do. After my junior year I stumbled upon a summer job at General Motors doing expert system development. I enjoyed that experience so much I decided to get my degree in "Math with computer science" (I was actually the first recipient of this degree in the newly created course 18C in 1986).

GM gave me a fellowship to continue my studies at the University of Texas in Machine Learning. After I got my Master's, U of T offered me a teaching fellowship to continue on for my Ph.D., but I preferred to get started with real life.

I went to work full time at the GM Tech Center in Warren, MI. There I designed knowledge-based systems and built some of the earliest data mining applications to diagnose automotive faults. I worked with the late Dr. Tom Lorenzen, a truly amazing person and statistician who taught me everything useful I know about how to get the greatest possible insight out of real-world data. I also worked with Dr. Sam Uthurusamy, who helped me early on to see the great potential of data mining and of the internet.

During my stay at GM I met my wife, Karon Barber, and won the GM Kettering Award in 1992 for the Design of Experiments Expert System (DEXPERT). Karon also won this award 3 years later for a Gear Design Expert System, making us the only married couple ever to win this prestigious honor.

I left GM for IBM in San Jose, CA in 1996 to work in a Web Mining group headed by Evangelous Simoudis. There I worked with Dharmendra Modha to develop some novel data visualization techniques that are used to help understand the results of k-means clustering.

In 1998, after Evangelous left, I joined IBM Research, and began working on Text Mining of help desk problem tickets under Norm Pass. This work gradually expanded into many other areas and applications dealing with mining all kinds of unstructured information. I believe this work has turned out to be my true calling, combining my love of math, reading, and knowledge-based systems, along with my love of solving complex problems.