This 20-minute talk describes ComputableFacts, an automated data processing system whose goal is to recover information from unstructured data in a variety of formats (Microsoft Office or Adobe PDF documents, emails, web pages, etc.) and convert it into a more usable form. Its key features are:
- Enforce authorizations across multiple access models to the database: batch, interactive and real-time
- Extract data and metadata from a variety of sources and file formats
- Provide a uniform representation of all data, regardless of its initial structure or format
- Build facts databases manually and/or automatically
- Automatically derive new facts using rules
- Execute complex queries
- Allow users to create alerts
- Allow users to share and comment on documents
- Allow users to create and export query-focused datasets
- Allow users to rate documents, then recommend documents of interest to them
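To illustrate what deriving new facts using rules can look like, here is a minimal forward-chaining sketch in Python. It is not ComputableFacts' actual engine or API: the triple representation of a fact, the `works_with` rule, and the `derive` function are all hypothetical names chosen for this example.

```python
from typing import Callable, Iterable, Set, Tuple

# Assumption: a fact is a (subject, predicate, object) triple.
Fact = Tuple[str, str, str]
# Assumption: a rule maps the set of known facts to candidate new facts.
Rule = Callable[[Set[Fact]], Iterable[Fact]]


def works_with(facts: Set[Fact]) -> Iterable[Fact]:
    """Hypothetical rule: two people employed by the same company work together."""
    employed = [(s, o) for (s, p, o) in facts if p == "employed_by"]
    for person_a, company_a in employed:
        for person_b, company_b in employed:
            if person_a != person_b and company_a == company_b:
                yield (person_a, "works_with", person_b)


def derive(facts: Iterable[Fact], rules: Iterable[Rule]) -> Set[Fact]:
    """Apply every rule repeatedly until no new fact appears (a fixpoint)."""
    known = set(facts)
    rules = list(rules)
    while True:
        new = {fact for rule in rules for fact in rule(known)} - known
        if not new:
            return known
        known |= new


if __name__ == "__main__":
    extracted = {
        ("alice", "employed_by", "acme"),
        ("bob", "employed_by", "acme"),
        ("carol", "employed_by", "initech"),
    }
    for fact in sorted(derive(extracted, [works_with])):
        print(fact)
```

Looping until a fixpoint lets rules build on facts that other rules produced, at the cost of rescanning the whole fact set on each pass; a production system would likely index facts rather than rescan them.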
Cyrille SAVELIEF is 34 years old and worked for 5 years as a Data Engineer at FactSet Research Systems Inc., where he specialized in assisted and automated fact extraction from analysts' research reports and financial statements. In 2011 he co-founded MNCC, a company providing services in Data Engineering & Analytics. For the past 3 years he has specialized in applying NLP to the exploratory analysis of heterogeneous document collections.