This 20-minute talk describes ComputableFacts, an automated data processing system whose goal is to recover information from unstructured data in a variety of formats (such as Microsoft Office or Adobe PDF documents, emails, and web pages) and convert it into a more usable form. Its key features are:
* Enforce authorizations across multiple access models to the database: batch, interactive and real-time.
* Data Engineering:
  * Extract data and metadata from a variety of sources and file formats;
  * Provide a uniform representation of all data, regardless of its initial structure or format.
* Knowledge Engineering:
  * Build fact databases manually and/or automatically;
  * Automatically derive new facts using rules;
  * Execute complex queries.
* Knowledge Dissemination:
  * Allow users to create alerts;
  * Allow users to share and comment on documents;
  * Allow users to create and export query-focused datasets;
  * Allow users to rate documents and later receive recommendations for documents of interest.
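
The abstract does not say how ComputableFacts represents facts or evaluates rules. As a generic illustration of what "automatically derive new facts using rules" can mean, here is a minimal forward-chaining sketch in Python. The triple layout, the `subsidiary_of` predicate, and the function names `transitive_subsidiary` and `derive` are all assumptions made for this example, not part of the system described in the talk.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical fact layout: (subject, predicate, object).
Fact = Tuple[str, str, str]
Rule = Callable[[Set[Fact]], Iterable[Fact]]


def transitive_subsidiary(facts: Set[Fact]) -> Iterable[Fact]:
    """Illustrative rule: if X is a subsidiary of Y and Y of Z, then X is a subsidiary of Z."""
    links = [(s, o) for (s, p, o) in facts if p == "subsidiary_of"]
    for x, y in links:
        for y2, z in links:
            if y == y2:
                yield (x, "subsidiary_of", z)


def derive(facts: Iterable[Fact], rules: Iterable[Rule]) -> Set[Fact]:
    """Apply every rule repeatedly until no new fact appears (a fixed point)."""
    known = set(facts)
    rules = list(rules)
    while True:
        # Collect everything the rules produce, then keep only unseen facts.
        new = {f for rule in rules for f in rule(known)} - known
        if not new:
            return known
        known |= new


if __name__ == "__main__":
    base = {
        ("A", "subsidiary_of", "B"),
        ("B", "subsidiary_of", "C"),
        ("C", "subsidiary_of", "D"),
    }
    for fact in sorted(derive(base, [transitive_subsidiary]) - base):
        print(fact)
```

The loop stops as soon as a full pass over the rules yields nothing new, so it terminates whenever the rules can only produce a finite set of facts. A production system would more likely use an indexed store or a Datalog-style engine than repeated full scans.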
Cyrille SAVELIEF is 34 years old and worked for 5 years as a Data Engineer at FactSet Research Systems Inc., where he specialized in assisted and automated fact extraction from analysts' research reports and financial statements. In 2011 he co-founded MNCC, a company providing Data Engineering & Analytics services. For the past 3 years he has specialized in applying NLP to the exploratory analysis of heterogeneous document collections.