Program at a Glance

Tuesday, September 8

09:00 OAIS Workshop - Room A101
SW4CH Workshop: Invited Talk – Integrating cultural data using an ontology-based framework - Room A102
BIGDAP Workshop - Room A202
MEBIS and GID Workshops: Session 1 – Evolving BI Systems - Room A203
10:30 Coffee Break
11:00 OAIS Workshop - Room A101
SW4CH Workshop: Session 1 – CIDOC CRM Real-life Use - Room A102
BIGDAP Workshop - Room A202
MEBIS and GID Workshops: Session 2 – GPUs in Databases and Data Warehouses - Room A203
12:30 Lunch Break (Plaza Hotel)
13:50 WISARD Workshop - Room A201
14:00 Tutorial – Towards an Era of Trust in Personal Data Management, Part 1 - Room A101
SW4CH Workshop: Session 2 – Cultural Heritage Preservation and Enhancement - Room A102
BIGDAP Workshop - Room A202
DCSA Workshop - Room A203
15:30 Coffee Break
16:00 Tutorial – Towards an Era of Trust in Personal Data Management, Part 2 - Room A101
SW4CH Workshop: Session 3 – Entity linking for Cultural Heritage - Room A102
DCSA Workshop - Room A203
WISARD Workshop - Room A201
17:30
18:30 Satellite Events Reception

Wednesday, September 9

09:00 Opening Session - Room A101
09:30 Keynote – The Story of Webdamlog - Room A101
10:30 Coffee Break
11:00 Research Session – Database Theory & Access Methods - Room A101
Research Session – User Requirements & Database Evolution - Room A102
Research Session – Multidimensional Modeling & OLAP - Room A202
12:30 Lunch Break (Plaza Hotel)
Steering Committee Lunch Meeting (Plaza Hotel)
14:30 Research Session – ETL - Room A101
Research Session – Time Series Processing - Room A102
Research Session – Preferences & Recommender Systems - Room A202
16:30
16:45 Historic Tour of Poitiers
19:00 Reception at the City Hall

Thursday, September 10

09:00 Tutorial – Query Processing: Beyond SQL and Relations, Part 1 - Room A101
Research Session – Transformation & Extraction - Room A102
10:30 Coffee Break
11:00 Tutorial – Query Processing: Beyond SQL and Relations, Part 2 - Room A101
Research Session – Ontologies - Room A102
12:30 Lunch Break
14:00 Keynote – The Case for Small Data Management - Room A102
15:00 Coffee Break
15:30 Research Session – Advanced Query Processing - Room A102
Research Session – New Trends in Data - Room A202
Research Session – Web Content - Room A203
17:30
18:30 Conference Dinner

Friday, September 11

09:00 Research Session – Advanced Design Modeling - Room A101
Research Session – Performance & Tuning - Room A202
10:30 Coffee Break
11:00 Research Session – Approximation & Skyline - Room A101
Research Session – Confidentiality & Trust - Room A202
12:30 Closing Session
13:00 Lunch Break

Keynotes

The Story of Webdamlog

by Serge Abiteboul

Room: A101

Chair: Patrick Valduriez

Abstract

We summarize work on distributed data management based on Webdamlog, an extension of Datalog.

The Case for Small Data Management

by Jens Dittrich

Room: A102

Chair: Tadeusz Morzy

Abstract

Exabytes of data; several hundred thousand TPC-C transactions per second on a single computing core; scale-up to hundreds of cores and a dozen Terabytes of main memory; scale-out to thousands of nodes with close to Petabyte-sized main memories; and massively parallel query processing are a reality in data management. But, hold on a second: for how many users exactly? How many users do you know that really have to handle these kinds of massive datasets and extreme query workloads? On the other hand: how many users do you know that are fighting to handle relatively small datasets, say in the range of a few thousand to a few million rows per table? How come some of the most popular open source DBMS have hopelessly outdated optimizers producing inefficient query plans? How come people don't care and love it anyway? Could it be that most of the world's data management problems are actually quite small? How can we increase the impact of database research in areas where datasets are small? What are the typical problems? What does this mean for database research? We discuss research challenges, directions, and a concrete technical solution coined PDbF: Portable Database Files (open source at https://github.com/uds-datalab/PDBF). See also our VLDB 2015 demo.

Tutorials

Towards an Era of Trust in Personal Data Management

by Nicolas Anciaux, Benjamin Nguyen and Iulian Sandu Popa

Room: A101

Abstract

Managing personal data with strong privacy guarantees has become an important topic in an age where your glasses record and share everything you see, your wallet records and shares your financial transactions, and your set-top box records and shares your energy consumption, while several recent affairs have unveiled the severe consequences of the loss of privacy. In this context, more and more alternatives are proposed based on user-centric and decentralized solutions, capitalizing on the use of trusted personal devices controlling the data at the edges of the Internet. Decentralized solutions are promising because they do not exhibit the intrinsic limitations of classical centralized solutions, e.g., sudden changes in the privacy policies of companies holding the data, data exposure through negligence or overly weak regulation, and exposure to sophisticated attacks, whose benefit/cost ratio is high for centralized databases. Hence, such solutions appear as a sea change for personal data management, where the control over personal data is pushed to the edges of the Internet, within sensors acquiring the data and in a variety of user devices endowed with a form of trust, e.g., tamper-resistant secure hardware-based devices.
This tutorial reviews several existing solutions going in this direction, presents a functional architecture encompassing these alternatives, and exposes the underlying techniques and open issues dealing with user-centric and decentralized data management platforms. In the first part, we review the recent initiatives pursuing the objective of reestablishing user control over their data by decentralizing this control in personal secure or trusted devices. We discuss an abstract distributed architecture focusing on secure storage, management and sharing of personal data, i.e., the asymmetric architecture, and indicate the main challenges inherent to decentralized data management. In the second part, we explore data management techniques exercised within a trusted device at the client side. We review the main attempts proposed in the literature and concentrate on those addressing the specific context of microcontrollers equipping sensors and mobile phones (SIM cards). In the third part, we investigate the problem of performing global processing without any compromise on data privacy. We present the difficulties that must be overcome to execute privacy-preserving computations on populations of personal devices, and illustrate them by focusing on Group By SQL queries and Privacy Preserving Data Publishing. In the fourth part, we conclude the tutorial by presenting existing and future instances of decentralized privacy-preserving data management architectures. We mainly focus on attempts and proposals targeting social-medical, smart-house, and rural-area contexts.

Query Processing: Beyond SQL and Relations

by Boris Novikov

Room: A101

Abstract

Query processing and optimization have been essential for any data processing system since the introduction of high-level declarative query languages in the early 1980s. During the last decade several new techniques were introduced in order to address the requirements of new classes of applications, data models, storage and indexing, and querying paradigms. Modern query processing and optimization extends far beyond relational queries. Several techniques were revised and a number of new techniques have been introduced to make query processing efficient. Several systems that were originally designed as low-level storage facilities implementing a persistence layer were augmented with high-level declarative features. Declarative scripting languages provide a technique for easy-to-understand specification of complex analytical scenarios that look sequential but are executed on massively parallel systems. The main focus of this tutorial is on query optimization and processing in new environments and for new classes of applications.
Although many declarative languages are designed as extensions to SQL, the internals of their implementations usually differ significantly from the well-known optimization and processing techniques developed for relational systems using row-based storage structures. Column stores are considered to be the most efficient for analytical processing on modern hardware. The physical algebraic operations for column stores differ from those used in row-based ones, and optimization strategies and heuristics are different. Distributed data processing systems such as Hadoop were not originally intended for declarative query processing. However, several query languages are implemented on top, bringing back the need for optimization. Examples of these languages and systems include ASTERIX, SCOPE, and Apache Hive. Processing of semi-structured and unstructured data ultimately requires fuzzy (e.g., similarity) queries, which pose several obstacles for relational optimizers that are mostly oriented toward reordering join operations. Although some recently introduced techniques, such as efficient top-down enumeration algorithms, might be helpful, many issues are still open. Parametric and dynamic optimization techniques seem to be especially useful for distributed heterogeneous environments where availability of data statistics is often severely limited and cost estimations are unreliable. Finally, holistic optimization is an emerging technology that optimizes database queries and the application together, with the goal of improving overall application performance.

Research Sessions

Room: A101

Chair: Yannis Manolopoulos

  1. Conditional Differential Dependencies (CDDs) by Selasi Kwashie, Jixue Liu, Jiuyong Li and Feiyue Ye (long paper)
  2. Revisiting the Definition of the Relational Tuple Calculus by Bader Albdaiwi and Bernhard Thalheim (short paper)
  3. Improving the Pruning Ability of Dynamic Metric Access Methods with Local Additional Pivots and Anticipation of Information by Paulo H. Oliveira, Caetano Traina Jr. and Daniel S. Kaster (long paper)

Room: A102

Chair: Marite Kirikova

  1. Two Phase User Driven Schema Matching by Nick Bozovic and Vasilis Vasalos (long paper)
  2. CoDEL - A Relationally Complete Language for Database Evolution by Kai Herrmann, Hannes Voigt, Andreas Behrend and Wolfgang Lehner (long paper)
  3. A Requirements Specification Framework for Big Data Collection and Capture by Noufa Al-Najran and Ajantha Dahanayake (short paper)

Room: A202

Chair: Orlando Belo

  1. Implementation of multidimensional databases in column-oriented NoSQL systems by Max Chevalier, Mohammed El Malki, Arlind Kopliku, Olivier Teste and Ronan Tournier (long paper)
  2. A Framework for Building OLAP Cubes on Graphs by Amine Ghrab, Oscar Romero, Sabri Skhiri, Alejandro Vaisman and Esteban Zimányi (long paper)
  3. A Generic Data Warehouse Architecture for Analyzing Workflow Logs by Christian Koncilia, Horst Pichler and Robert Wrembel (long paper)

Room: A101

Chair: Helena Galhardas

  1. HBelt: Integrating an Incremental ETL Pipeline with a Big Data Store for Real-time Analytics by Weiping Qu, Sahana Shankar, Sandy Ganza and Stefan Dessloch (long paper)
  2. Two-ETL phases for Data Warehouse creation: Design and Implementation by Ahlem Nabli, Senda Bouaziz, Rania Yangui and Faiez Gargouri (long paper)
  3. AutoScale: Automatic ETL scale process by Pedro Martins, Maryam Abbasi and Pedro Furtado (short paper)
  4. Using a Domain-Specific Language to Enrich ETL Schemas by Orlando Belo, Claudia Gomes, Bruno Oliveira, Ricardo Marques and Vasco Santos (short paper)

Room: A102

Chair: Christian Koncilia

  1. ForCE: Is Estimation of Data Completeness Through Time Series Forecasts Feasible? by Gregor Endler, Philipp Baumgärtel, Andreas M. Wahl and Richard Lenz (long paper)
  2. Best-match Time Series Subsequence Search on the Intel Many Integrated Core Architecture by Mikhail Zymbler (long paper)
  3. Feedback Based Continuous Skyline Queries over a Distributed Framework by Ahmed Khan Leghari, Jianneng Cao and Yongluan Zhou (long paper)
  4. Continuous Query Processing over Data, Streams and Services: Application to Robotics by Vasile-Marian Scuturici, Yann Gripay, Jean-Marc Petit, Yutaka Deguchi and Einoshin Suzuki (short paper)

Room: A202

Chair: Alsayed Algergawy

  1. The Structure of Preference Orders by Markus Endres (long paper)
  2. Database Querying in the Presence of Suspect Values by Olivier Pivert and Henri Prade (short paper)
  3. Context-Awareness and Viewer Behavior Prediction in Social-TV Recommender Systems: Survey and Challenges by Mariem Bambia, Rim Faiz and Mohand Boughanem (short paper)
  4. Generalized Bichromatic Homogeneous Vicinity Query Algorithm in Road Network Distance by Yutaka Ohsawa, Htoo Htoo, Naw Jacklin Nyunt and Myint Myint Sein (short paper)

Room: A102

Chair: Robert Wrembel

  1. Direct Transformation Techniques for Compressed Data: General Approach and Application Scenarios by Patrick Damme, Dirk Habich and Wolfgang Lehner (long paper)
  2. Analysis of the Blocking Behaviour of Schema Transformations in Relational Database Systems by Lesley Wevers, Matthijs Hofstra, Menno Tammens, Marieke Huisman and Maurice van Keulen (long paper)
  3. A Benchmark for Relation Extraction Kernels by João L. M. Pereira, Helena Galhardas and Bruno Martins (long paper)

Room: A102

Chair: Maria Keet

  1. Ontological Commitments, DL-Lite Logics and Reasoning Tractability by Mauricio Minuto Espil, María Gabriela Ojea and Maria Alejandra Ojea (long paper)
  2. SeeCOnt: A New Seeding-based Clustering Approach For Ontology Matching by Alsayed Algergawy, Samira Babalou, Mohammad J. Kargar and S. Hashem Davarpanah (long paper)
  3. SLA Ontology-Based Elasticity in Cloud Computing by Taher Labidi, Achraf Mtibaa and Faiez Gargouri (short paper)

Room: A102

Chair: Jaroslav Pokorný

  1. A Self-Tuning Framework for Cloud Storage Clusters by Siba Mohammad, Eike Schallehn and Gunter Saake (long paper)
  2. Incrementally Maintaining Materialized Temporal Views in Column-oriented NoSQL Databases with Partial Deltas by Yong Hu and Stefan Dessloch (short paper)
  3. Towards self-management in a distributed column-store system by George Chernishev (short paper)
  4. Optimizing Sort in Hadoop using Replacement Selection by Pedro Martins Dusso, Caetano Sauer and Theo Härder (long paper)

Room: A202

Chair: Einoshin Suzuki

  1. Distributed Sequence Pattern Detection over Multiple Data Streams by Ahmed Khan Leghari, Jianneng Cao and Yongluan Zhou (long paper)
  2. Relational-Based Sensor Data Cleansing by Nadeem Iftikhar, Xiufeng Liu and Finn Ebertsen Nordbjerg (short paper)
  3. Avoiding Ontology Confusion in ETL Processes by Selma Khouri, Sabrina Abdellaoui and Fahima Nader (short paper)
  4. Towards A Generic Approach for the Management and the Assessment of Cooperative Work by Amina Cherouana, Amina Aouine, Abdelaziz Khadraoui and Latifa Mahdaoui (short paper)

Room: A203

Chair: Johann Gamper

  1. Web Content Management Systems Archivability by Vangelis Banos and Yannis Manolopoulos (long paper)
  2. MLES: Multilayer Exploration Structure for Multimedia Exploration by Juraj Moško, Jakub Lokoč, Tomáš Grošup, Přemysl Čech, Tomáš Skopal and Jan Lánský (short paper)

Room: A101

Chair: Bernhard Thalheim

  1. Evidence-based Languages for Conceptual Data Modelling Profiles by Pablo R. Fillottrani and C. Maria Keet (long paper)
  2. OLAP4Tweets: Multidimensional Modeling of tweets by Maha Ben Kraiem, Jamel Feki, Kaïs Khrouf, Franck Ravat and Olivier Teste (short paper)
  3. Data Warehouse Design Methods Review: Trends, Challenges and Future Directions for the Healthcare Domain by Christina Khnaisser, Luc Lavoie, Hassan Diab and Jean-François Ethier (short paper)

Room: A202

Chair: Boris Novikov

  1. Partitioning Templates for RDF by Rebeca Schroeder and Carmem S. Hara (long paper)
  2. Efficient Computation of Parsimonious Temporal Aggregation by Giovanni Mahlknecht, Anton Dignös and Johann Gamper (long paper)
  3. TDQMed: Managing Collections of Complex Test Data by Johannes Held and Richard Lenz (long paper)

Room: A101

Chair: Yannis Manolopoulos

  1. Space-bounded query approximation by Boris Cule, Floris Geerts and Reuben Ndindi (long paper)
  2. Bi-objective Optimization for Approximate Query Evaluation by Anna Yarygina and Boris Novikov (short paper)
  3. Hybrid Web Service Discovery Based on Fuzzy Condorcet Aggregation by Fethallah Hadjila, Amel Halfaoui and Amine Belabed (long paper)

Room: A202

Chair: Ladjel Bellatreche

  1. Confidentiality Preserving Evaluation of Open Relational Queries by Joachim Biskup, Martin Bring and Michael Bulinski (long paper)
  2. A General Trust Management Framework for Provider Selection in Cloud Environment by Fatima Zohra Filali and Belabbas Yagoubi (long paper)
  3. Sybil Tolerance and Probabilistic Databases to Compute Web Services Trust by Zohra Saoud, Noura Faci, Zakaria Maamar and Djamal Benslimane (long paper)

Workshops

OAIS Workshop

Website: http://oais2015.ensma.fr

Room: A101

  • Mobile Co-Authoring of Linked Data in the Cloud by Moulay Driss Mechaoui, Nadir Guetmi and Abdessamad Imine
  • Ontology based Linkage between Enterprise Architecture, Processes, and Time by Marite Kirikova, Ludmila Penicina and Andrejs Gaidukovs
  • Fuzzy Inference-based Ontology Matching Using Upper Ontology by S. Hashem Davarpanah, Alsayed Algergawy and Samira Babalou
  • An ontology-based approach for handling explicit and implicit knowledge over trajectories by Rouaa Wannous, Cécile Vincent, Jamal Malki and Alain Bouju
  • Interpretation of DD-LOTOS specification by C-DATA* by Toufik Messaoud Maarouk, Djamel Eddine Saïdouni, Rafik Mahdaoui and Hichem Houassi

SW4CH Workshop

Website: http://sw4ch2015.ensma.fr

Room: A102

Invited Talk

Integrating cultural data using an ontology-based framework

by Martin Rezk

Abstract

In this talk I will introduce an ontology-based data access (OBDA) framework that virtually integrates different databases by means of a conceptual layer (an ontology). The ontology provides a convenient query vocabulary to the user, and a unified view of the underlying data. The ontology is connected to the data sources through a declarative specification given in terms of mappings. I will illustrate how to integrate cultural data by relying on an OBDA framework. In particular, I will concentrate on the following crucial questions:

  • How this paradigm can contribute to ease the access of scholars to cultural heritage data: integrating temporal and spatial data, cross-linking datasets.
  • What is the theory behind it.
  • How to map available data sources to an ontology.
  • How to query the underlying data sources using the terms in the ontology.
  • How to check consistency of the data sources w.r.t. the ontology.

Session 1: CIDOC CRM Real-life Use

  1. Knowledge Representation in EPNet by Alessandro Mosca, José Remesal, Martin Rezk and Guillem Rull
  2. A Pattern-based Framework for Best Practice Implementation of CRM/FRBRoo by Trond Aalberg, Audun Vennesland and Maliheh Farrokhnia
  3. Application of CIDOC-CRM for the Russian Heritage Cloud platform by Eugene Cherny, Peter Haase, Dmitry Mouromtsev, Alexey Andreev and Dmitry Pavlov

Session 2: Cultural Heritage Preservation and Enhancement

  1. Designing for Inconsistency – The Dependency-based PERICLES Approach by Jean-Yves Vion-Dury, Nikolaos Lagos, Efstratios Kontopoulos, Marina Riga, Panagiotis Mitzias, Georgios Meditskos, Simon Waddington, Pip Laurenson and Ioannis Kompatsiaris
  2. A Semantic exploration method based on an ontology of 17th century texts on theatre: la Haine du theatre by Chiara Mainardi, Zied Sellami and Vincent Jolivet
  3. Combining semantic and collaborative recommendations to generate personalized museum tours by Idir Benouaret and Dominique Lenne

Session 3: Entity linking for Cultural Heritage

  1. Improving Retrieval of Historical Content with Entity Linking by Max De Wilde
  2. A Novel Vision for Navigation and Enrichment in Cultural Heritage Collections by Joffrey Decourselle, Audun Vennesland, Trond Aalberg, Fabien Duchateau and Nicolas Lumineau
  3. Disambiguation of Named Entities in cultural heritage texts using Linked Data sets by Carmen Brando, Francesca Frontini and Jean-Gabriel Ganascia

BIGDAP Workshop

Website: http://dbdmg.polito.it/bigdap2015/

Room: A202

  • Cross-Checking Data Sources in MapReduce by Foto Afrati, Zaid Momani and Nikos Stasinopoulos
  • CLUS: Parallel subspace clustering algorithm on SPARK by Bo Zhu, Alexandru Mara and Alberto Mozo
  • Massively Parallel Unsupervised Feature Selection on Spark by Bruno Ordozgoiti, Sandra Gómez Canaval and Alberto Mozo
  • Unsupervised Network Anomaly Detection in Real-time on Big Data by Juliette Dromard, Gilles Roudière and Philippe Owezarski
  • NPEPE: Massive Natural Computing Engine for Optimally Solving NP-complete Problems in Big Data Scenarios by Sandra Gómez Canaval, Bruno Ordozgoiti Rubio and Alberto Mozo
  • Andromeda: A System for Processing Queries and Updates on Big XML Documents by Nicole Bidoit, Dario Colazzo, Carlo Sartiani, Alessandro Solimando and Federico Ulliana
  • Fast and effective decision support for crisis management by the analysis of people's reactions collected from Twitter by Antonio Attanasio, Louis Jallet, Antonio Lotito, Michele Osella and Francesco Ruà
  • Adaptive Quality of Experience: a novel approach to real-time big data analysis in core networks by Alejandro Bascuñana, Manuel Lorenzo, Miguel-Ángel Monjas and Patricia Sánchez
  • A review of scalable approaches for Frequent Itemset Mining by Daniele Apiletti, Paolo Garza and Fabio Pulvirenti

WISARD Workshop

Website: http://www.irit.fr/wisard2015/

Room: A201

Session 1 (starts at 13:50)

  • Workshop Opening by Florence Sèdes
  1. ADMAN: an Alarm-based mobile Diabetes MANagement system for mobile geriatric teams by Dana Al Kukhun, Bouchra Soukkarieh and Florence Sèdes
  2. Abduction for Analysing Data Exchange Policies by Laurence Cholvy
  3. An Architectural Roadmap Towards Building an Alarm Diffusion System by Sumit Kalra, T. V. Prabhakar and Saurabh Srivastava
  4. A case study on the influence of the user profile enrichment on Buzz propagation in social media: Experiments on Delicious by Manel Mezghani, Sirinya On-At, André Peninou, Marie-Françoise Canut, Corinne Amel Zayani, Ikram Amous and Florence Sèdes

Session 2

  1. Critical Information Diffusion Systems by Rémi Delmas and Thomas Polacsek
  2. Information exchange policies at an organisational level: formal expression and analysis by Claire Saurel

DCSA Workshop

Website: http://www.is.informatik.uni-kiel.de/en/is/events/dcsa2015adbis/

Room: A203

  • A Mutual Resource Exchanging Model and its Applications to Data Analysis in Mobile Environment by Naofumi Yoshida
  • Detection of trends and opinions in geo-tagged social text streams by Jevgenij Jakunschin, Andreas Heuer and Antje Raab-Düsterhöft
  • Software Architecture for Collaborative Crowd-storming Applications by Nouf Jaafar and Ajantha Dahanayake
  • Gamification in Saudi Society: A Framework to Develop Human Values for Early Generations by Alia AlBalawi, Bariah AlSaawi, Ghada AlTassan and Zaynab Fakeerah

Websites: http://www.cs.put.poznan.pl/rwrembel/MEBIS2015.html (MEBIS) and http://gid.us.to/ (GID)

Room: A203

Session 1: Evolving BI Systems

  1. E-ETL Framework: ETL Process Reparation Algorithms using Case-based Reasoning by Artur Wojciechowski
  2. Handling Evolving Data Warehouse Requirements by Darja Solodovnikova, Laila Niedrite and Natalija Kozmina
  3. Querying Multiversion Data Warehouses by Waqas Ahmed and Esteban Zimányi

Session 2: GPUs in Databases and Data Warehouses

  1. CUDA-Powered CTBE Algorithm for Zero-Latency Data Warehouse by Marcin Gorawski, Damian Lis and Anna Gorawska
  2. Big Data Conditional Business Rule Calculations in Multidimensional In-GPU-Memory OLAP Databases by Alexander Haberstroh and Peter Strohm
  3. Optimizing Sorting and Top-k Selection Steps in Permutation Based Indexing on GPUs by Martin Kruliš, Hasmik Osipyan and Stéphane Marchand-Maillet

Social Events

Historic Tour of Poitiers

A visit to the Roman Catholic church Notre-Dame-la-Grande, the Palace of the Counts of Poitou and Dukes of Aquitaine, and the surrounding neighborhood.

Conference Dinner

The conference dinner will take place at the Château de la Mothe en Poitou.