Research Engineer - Semantic Analysis and Integration




I am a member of the Statistical Content Analysis group at the IBM Thomas J. Watson Research Center, NY. Before that, I spent close to 4 years at IBM Research, India, where I was part of the HLT group. My interests span several areas, including information retrieval, information extraction, and text mining.

My M.Tech thesis with Prof. Soumen Chakrabarti involved developing CSAW, a system for Curating and Searching the Annotated Web. CSAW annotates named entities in Web-scale text corpora and, where confident, connects these annotations with entries in an entity and type catalog such as Wikipedia. The semi-structured catalog, together with the unstructured corpus, forms a composite database that CSAW can then search using powerful reachability, proximity, and aggregation primitives. CSAW comprises billions of annotation links between a 500-million-page Web corpus and millions of entities known to Wikipedia. Before my time at IITB, I worked on middleware applications for 3.5 years.

