IBM.Next: QuerioDALI       

Spyros Kotoulas, Vanessa Lopez, Pierpaolo Tommasi


QuerioDALI: Open-Domain Question Answering over Linked Data

GOALS:

- Explore natural ways to build queries that answer complex information needs over large, heterogeneous, and distributed knowledge graphs (KGs) obtained from both open and enterprise semi-structured data.

- Users can pose voice or text queries in natural language. The Watson NLP pipeline parses the natural language queries, and QuerioDALI translates the parsed queries into graph patterns (i.e., formal graph queries) to obtain answers and evidence, and ultimately to create views and explorations derived from the users' questions as well as from the geo-spatial, temporal, social, and semantic context of a given scenario.
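To make the translation step in the goals above more concrete, here is a minimal sketch of how an already-parsed question might be turned into a SPARQL graph pattern. It is an illustration under stated assumptions, not the QuerioDALI implementation: the `TriplePattern` class, the `build_sparql` helper, and the example parse are hypothetical, and only the DBpedia prefixes and properties are real vocabulary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TriplePattern:
    """One edge of a graph pattern (hypothetical helper, not a QuerioDALI type)."""
    subject: str
    predicate: str
    obj: str

    def to_sparql(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj} ."


def build_sparql(select_var: str, patterns: List[TriplePattern], limit: int = 10) -> str:
    """Assemble a SELECT query from triple patterns (assumed query shape)."""
    body = "\n".join(f"  {p.to_sparql()}" for p in patterns)
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        f"SELECT DISTINCT {select_var} WHERE {{\n{body}\n}} LIMIT {limit}"
    )


# Assumed parse of "Which books were written by authors born in Dublin?"
# A real NLP pipeline would produce these patterns; here they are hand-written.
patterns = [
    TriplePattern("?book", "dbo:author", "?author"),
    TriplePattern("?author", "dbo:birthPlace", "dbr:Dublin"),
]
query = build_sparql("?book", patterns)
print(query)
```

The resulting string could then be sent to any SPARQL endpoint that exposes DBpedia-style data. The point of the sketch is only that a question needing two linked facts maps to a two-edge graph pattern joined on a shared variable.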

CHALLENGES AND CONTRIBUTIONS

- Exploit Watson NLP and IBM.Next cognitive tools, such as speech-to-text and DALI, to query knowledge graphs (KGs) obtained from Open Linked Data or built for a particular scenario (e.g., Smarter Care).

- The data may come from semantic repositories (e.g., in the Data Lake, or accessible through SPARQL endpoints) or be captured using the IBM.Next tool DALI. DALI is used for incrementally building and capturing information into a KG, where all the relevant structured and semi-structured (enterprise and open) information is meaningfully lifted (by extracting the relevant entities, relations, and datatypes) and linked to well-known standards (W3C, schema.org), domain-specific models (e.g., 211 taxonomy, Social care taxonomy, human diseases ontology), tabular city data sources (e.g., DubLinked), and general-knowledge Linked Open Data (LOD) sources (DBpedia, WordNet).

- Bring question answering abilities over large open-domain KGs to Watson, with evolving data, no fixed schemas, and no training corpus. Watson pipeline components can be reused in a novel way to answer queries that cannot be answered by text fragments alone, but only by combining facts across semi-structured sources.

- Present and rank answers and evidence for user queries, even those that can only be answered by combining and aggregating facts from one or more (heterogeneous) sources on the fly (i.e., uncovering connections on the fly to combine partial interpretations and answers).

- Learn from user interaction, as well as from spatio-temporal context in the case of users on the move using mobile devices (future work).
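The idea of ranking answers together with their evidence, as listed above, can be illustrated with a small sketch. The page does not describe the actual ranking method, so everything here is an assumption for illustration: the `rank_answers` function, the `(answer, source, confidence)` tuple shape, the placeholder data, and the choice to sum confidences so that answers supported by several sources rise to the top.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (answer, source, confidence): an assumed shape for partial answers
Candidate = Tuple[str, str, float]


def rank_answers(candidates: Iterable[Candidate]) -> List[Tuple[str, float, List[str]]]:
    """Merge partial answers from several sources and rank them.

    Assumed scoring: sum the confidences per answer and keep the list of
    supporting sources as evidence. This is not the QuerioDALI ranking method.
    """
    scores: Dict[str, float] = defaultdict(float)
    evidence: Dict[str, List[str]] = defaultdict(list)
    for answer, source, confidence in candidates:
        scores[answer] += confidence
        if source not in evidence[answer]:
            evidence[answer].append(source)
    # Highest score first; ties broken alphabetically for a stable order
    ranked = sorted(scores, key=lambda a: (-scores[a], a))
    return [(a, scores[a], evidence[a]) for a in ranked]


# Placeholder data: two hypothetical sources returning partial answers
candidates = [
    ("answer_A", "source_1", 0.6),
    ("answer_B", "source_1", 0.7),
    ("answer_A", "source_2", 0.5),
]
ranked = rank_answers(candidates)
for answer, score, sources in ranked:
    print(answer, round(score, 2), sources)
```

In this sketch `answer_A` outranks `answer_B` because two sources support it, even though neither source alone gives it the highest confidence. Other aggregation schemes (maximum, weighted by source reliability) would fit the same structure.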

 

IBM.Next Demos (Internal)

- YouTube video for the Smarter Care scenario (with biomedical ontologies): https://youtu.be/aTCm_GCizFg

- YouTube video for Open Domain QA over DBpedia: https://youtu.be/9_lqyOrpYbk