Intelligent Knowledge Extraction From a Document Database

Extracting knowledge from big databases and from document databases has long been a challenge. This session presents a small example: the literature review that every researcher performs in the course of their studies. Most researchers do a literature review and manage their own document database. Several tools are available for this (e.g., EndNote, Mendeley, Word, and Excel), but they provide little more than a bibliography.

When the goal is to retrieve quantitative information (a list of concepts or items), these tools show a severe lack of functionality. The case proposed for this session is a “little brother” of the general problem: the approach, methodology, and principles used in this example are the same as those used in bigger cases, although the IT tools required are much simpler. Searching a document database for concepts (e.g., ideas, subjects) requires a prior concept analysis to define the keywords that will be used later on; searching for items (e.g., to compile a list of risks) is easier because it merely consists of finding words in the text.
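As a rough illustration of the item search described above, the following Python sketch counts how many documents mention each item and returns the items in descending order. It is only a minimal sketch under stated assumptions, not the method presented in the session: the function name rank_items, the sample documents, and the keyword lists are all hypothetical, and ranking by document count is one possible choice among several.

```python
import re
from collections import Counter


def rank_items(documents, keywords):
    """Rank items by the number of documents that mention them.

    documents: dict mapping a document id to its plain text (hypothetical data).
    keywords:  dict mapping an item name to the search terms that stand for it.
    """
    counts = Counter()
    for text in documents.values():
        lowered = text.lower()
        for item, terms in keywords.items():
            # Whole-word, case-insensitive match on any of the item's terms.
            if any(
                re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
                for term in terms
            ):
                counts[item] += 1
    # Most frequently mentioned items first.
    return counts.most_common()


# Hypothetical sample data, invented for illustration only.
documents = {
    "paper_a": "Cost overrun and schedule delay are the main risks.",
    "paper_b": "Schedule delay was reported in most projects.",
    "paper_c": "Safety incidents and cost overruns were frequent.",
}
keywords = {
    "cost overrun": ["cost overrun", "cost overruns"],
    "schedule delay": ["schedule delay", "schedule delays"],
    "safety": ["safety incident", "safety incidents"],
}

ranking = rank_items(documents, keywords)
for item, n_docs in ranking:
    print(f"{item}: mentioned in {n_docs} document(s)")
```

A concept search would follow the same pattern, with the difference that the keyword lists would come out of the prior concept analysis rather than being the item names themselves.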


Attendees will learn the keys to creating a big document database, to extracting information from it according to their predefined breakdown structure, and to obtaining a ranked list of concepts and items with which to define priorities and make decisions. These results are relevant for researchers, and they are an example of what companies could do to organize and use their stored information simply and effectively.

Speaker profiles

Dr. Vegas‐Fernández has been CIO for more than 25 years in several Spanish IBEX-35 companies, Organization Manager for 10 years, and Risk Manager for 3 years, and has spent 6 years on research while writing a doctoral thesis. He has received two Best Innovation Idea awards related to Competitive Intelligence applied to risk assessment. He has been a member of professional associations such as ASIS, CIONET, ARIA, and Agers (FERMA).

His broad experience helps him to focus on the board’s needs with an overall view of the business. He has published articles about IT and risk in indexed journals and has spoken at PMI (Valencia Chapter), the Institute for Competitive Intelligence, the Agers Annual Congress, and other events. As a professor, he has lectured in university Master’s programs on IT, risk, and the construction industry. He currently conducts research at Universidad Politécnica de Madrid and is Executive Advisor at IDC.