|09:00||OAIS Workshop - Room A101||SW4CH Workshop
Invited Talk – Integrating cultural data using an ontology-based framework - Room A102
|BIGDAP Workshop - Room A202||MEBIS and GID Workshops
Session 1 – Evolving BI Systems - Room A203
|11:00||OAIS Workshop - Room A101||SW4CH Workshop
Session 1 – CIDOC CRM Real-life Use - Room A102
|BIGDAP Workshop - Room A202||MEBIS and GID Workshops
Session 2 – GPUs in Databases and Data Warehouses - Room A203
|12:30||Lunch Break (Plaza Hotel)|
|13:50||WISARD Workshop - Room A201|
|14:00||Tutorial – Towards an Era of Trust in Personal Data Management, Part 1 - Room A101||SW4CH Workshop
Session 2 – Cultural Heritage Preservation and Enhancement - Room A102
|BIGDAP Workshop - Room A202||DCSA Workshop - Room A203|
|16:00||Tutorial – Towards an Era of Trust in Personal Data Management, Part 2 - Room A101||SW4CH Workshop
Session 3 – Entity linking for Cultural Heritage - Room A102
|DCSA Workshop - Room A203||WISARD Workshop - Room A201|
|18:30||Satellite Events Reception|
|09:00||Opening Session - Room A101|
|09:30||Keynote – The Story of Webdamlog - Room A101|
|11:00||Research Session – Database Theory & Access Methods - Room A101||Research Session – User Requirements & Database Evolution - Room A102||Research Session – Multidimensional Modeling & OLAP - Room A202|
|12:30||Lunch Break (Plaza Hotel)||Steering Committee Lunch Meeting (Plaza Hotel)|
|14:30||Research Session – ETL - Room A101||Research Session – Time Series Processing - Room A102||Research Session – Preferences & Recommender Systems - Room A202|
|16:45||Historic Tour of Poitiers|
|19:00||Reception at the City Hall|
|09:00||Tutorial – Query Processing: Beyond SQL and Relations, Part 1 - Room A101||Research Session – Transformation & Extraction - Room A102|
|11:00||Tutorial – Query Processing: Beyond SQL and Relations, Part 2 - Room A101||Research Session – Ontologies - Room A102|
|14:00||Keynote – The Case for Small Data Management - Room A102|
|15:30||Research Session – Advanced Query Processing - Room A102||Research Session – New Trends in Data - Room A202||Research Session – Web Content - Room A203|
|09:00||Research Session – Advanced Design Modeling - Room A101||Research Session – Performance & Tuning - Room A202|
|11:00||Research Session – Approximation & Skyline - Room A101||Research Session – Confidentiality & Trust - Room A202|
by Serge Abiteboul
Chair: Patrick Valduriez
Abstract
We summarize work on managing data in a distributed manner based on Webdamlog, an extension of datalog.
by Jens Dittrich
Chair: Tadeusz Morzy
Abstract
Exabytes of data; several hundred thousand TPC-C transactions per second on a single computing core; scale-up to hundreds of cores and a dozen Terabytes of main memory; scale-out to thousands of nodes with close to Petabyte-sized main memories; and massively parallel query processing are a reality in data management. But, hold on a second: for how many users exactly? How many users do you know that really have to handle these kinds of massive datasets and extreme query workloads? On the other hand: how many users do you know that are fighting to handle relatively small datasets, say in the range of a few thousand to a few million rows per table? How come some of the most popular open source DBMS have hopelessly outdated optimizers producing inefficient query plans? How come people don't care and love it anyway? Could it be that most of the world's data management problems are actually quite small? How can we increase the impact of database research in areas where datasets are small? What are the typical problems? What does this mean for database research? We discuss research challenges, directions, and a concrete technical solution coined PDbF: Portable Database Files (open source at https://github.com/uds-datalab/PDBF). See also our VLDB 2015 demo.
by Nicolas Anciaux, Benjamin Nguyen and Iulian Sandu Popa
Managing personal data with strong privacy guarantees has become an important topic in an age where your glasses record and share everything you see, your wallet records and shares your financial transactions, and your set-top box records and shares your energy consumption, while several recent affairs have unveiled the severe consequences of the loss of privacy. In this context, more and more alternatives are proposed based on user-centric and decentralized solutions, capitalizing on the use of trusted personal devices controlling the data at the edges of the Internet. Decentralized solutions are promising because they do not exhibit the intrinsic limitations of classical centralized solutions, e.g., sudden changes in the privacy policies of companies holding the data, data exposure through negligence or overly weak policies, and exposure to sophisticated attacks whose benefit/cost ratio is high for centralized databases. Hence, such solutions appear as a sea change for personal data management, where the control over personal data is pushed to the edges of the Internet, within sensors acquiring the data and in a variety of user devices endowed with a form of trust, e.g., tamper-resistant secure hardware-based devices.
This tutorial reviews several existing solutions going in this direction, presents a functional architecture encompassing these alternatives, and exposes the underlying techniques and open issues dealing with user-centric and decentralized data management platforms. In a first part, we review the recent initiatives pursuing the objective of reestablishing user control over their data by decentralizing this control in personal secure or trusted devices. We discuss an abstract distributed architecture focusing on secure storage, management and sharing of personal data, i.e., the asymmetric architecture, and indicate the main challenges inherent to decentralized data management. In a second part, we explore data management techniques exercised within a trusted device at the client side. We review the main attempts proposed in the literature and concentrate on those addressing the specific context of microcontrollers equipping sensors and mobile phones (SIM cards). In a third part, we investigate the problem of performing global processing without any compromise on data privacy. We present the difficulties to overcome in executing privacy-preserving computations on populations of personal devices, and illustrate them by focusing on Group By SQL queries and Privacy Preserving Data Publishing. In a fourth part, we conclude the tutorial by presenting existing and future instances of decentralized privacy-preserving data management architectures. We mainly focus on attempts and proposals targeting social-medical, smart house, and rural area contexts.
by Boris Novikov
Query processing and optimization have been essential for any data processing system since the introduction of high-level declarative query languages in the early 1980s. During the last decade, several new techniques were introduced to address the requirements of new classes of applications, data models, storage and indexing, and querying paradigms.
Modern query processing and optimization extend far beyond relational queries. Several techniques were revised and a number of new techniques have been introduced to make query processing efficient. Several systems that were originally designed as low-level storage facilities implementing a persistence layer were augmented with high-level declarative features. Declarative scripting languages provide a technique for easy-to-understand specification of complex analytical scenarios that look sequential but are executed on massively parallel systems.
The main focus of this tutorial is on query optimization and processing in new environments and for new classes of applications.
Although many declarative languages are designed as extensions to SQL, the internals of their implementations usually differ significantly from the well-known optimization and processing techniques developed for relational systems using row-based storage structures. Column stores are considered to be the most efficient for analytical processing on modern hardware. The physical algebraic operations for column stores differ from those used in row-based ones, and the optimization strategies and heuristics are different. Distributed data processing systems such as Hadoop weren't originally intended for declarative query processing. However, several query languages are implemented on top, bringing back the need for optimization. Examples of these languages and systems include ASTERIX, SCOPE, and Apache Hive. Processing of semi-structured and unstructured data ultimately requires fuzzy (e.g. similarity) queries, resulting in several obstacles for relational optimizers, which are mostly oriented toward re-ordering join operations. Although some recently introduced techniques, such as efficient top-down enumeration algorithms, might be helpful, many issues are still open. Parametric and dynamic optimization techniques seem to be especially useful for distributed heterogeneous environments where availability of data statistics is often severely limited and cost estimations are unreliable. Finally, holistic optimization is an emerging technology that optimizes the database queries and the application together with the goal of improving overall application performance.
Chair: Yannis Manolopoulos
Chair: Marite Kirikova
Chair: Orlando Belo
Chair: Helena Galhardas
Chair: Christian Koncilia
Chair: Alsayed Algergawy
Chair: Robert Wrembel
Chair: Maria Keet
Chair: Jaroslav Pokorný
Chair: Einoshin Suzuki
Chair: Johann Gamper
Chair: Bernhard Thalheim
Chair: Boris Novikov
Chair: Yannis Manolopoulos
Chair: Ladjel Bellatreche
by Martin Rezk
Abstract
In this talk we will introduce an ontology-based data access (OBDA) framework that allows different databases to be virtually integrated by means of a conceptual layer (an ontology). The ontology provides a convenient query vocabulary to the user, and a unified view of the underlying data. The ontology is connected to the data sources through a declarative specification given in terms of mappings. I will illustrate how to integrate cultural data by relying on an OBDA framework. In particular, I will concentrate on the following crucial questions:
Websites: http://www.cs.put.poznan.pl/rwrembel/MEBIS2015.html (MEBIS) and http://gid.us.to/ (GID)
Visit to the Roman Catholic church Notre-Dame-La-Grande, the Palace of the Counts of Poitou-Dukes of Aquitaine, and their neighborhood.
The conference dinner will take place at the Château de la Mothe en Poitou.