Database Systems
Prof. Dr. Jens Dittrich, Dr. Felix Martin Schuhknecht, M.Sc. Immanuel Leonard Haffner
Core Lecture, 9 CP, Winter Semester 2017

News

20.11.2017

Mini-Test 4 Results

The results of mini-test 4 are available.

20.11.2017

Tutorial 1 by Ankur Sharma dropped

Due to the low utilization of the tutorial slots, we will no longer offer the tutorial slot of Ankur. Please either go to the slots offered by Maha Aburahma (Tuesday, 12:00, 3.06 E11) or Mansi Misra (Wednesday, 10:00, 3.06 E11).

15.11.2017

VirtualBox VM for Programming Tasks

We uploaded a VirtualBox VM based on Arch Linux that can be used to solve the programming tasks. All relevant build tools are pre-installed.

14.11.2017

Programming LAB Today 14:15

As already announced in the previous LAB, we will have a programming LAB today at 14:15 to continue with the first programming task.

13.11.2017

Mini-Test 3 Results

The results of mini-test 3 are available. As the average is around 7, we decreased the max. points to 10 with 2 achievable bonus points.

08.11.2017

LAB November 09 Videos: Added 2.3.1, 2.3.2, and 2.3.3

Unfortunately, the three videos 2.3.1, 2.3.2, and 2.3.3 for the upcoming LAB were not linked properly in the materials section. They have been added now. There won't be questions about these videos in the upcoming mini-test.   

06.11.2017

Project: Milestone 1

Dear students,

The first milestone has just been published. You can find the assignment in the materials section under Milestone 1. Please read the assignment and follow the instructions carefully.

You have already been granted access to your project repository on Gogs. Simply sign in, and you should see that you are now a collaborator of a project "Infosys/TeamXYZ". Note: If you do not have access to the project repository yet, that is because you did not follow the instructions in the Project Information sheet. You must sign in to Gogs before we can assign you to your repository. Please do not respond to this notification by email, and just drop by the programming lab tomorrow.

May you do well! ;)

06.11.2017

Mini-Test 2 Results

The results of mini-test 2 are available.

30.10.2017

Mini-Test 1 Results

The results of mini-test 1 are available. As video 0.2.1 (A Footnote about the Young History of Database Systems) was not linked in CMS, we decreased the max. points to 10 with 2 achievable bonus points.

Grading scheme: If a question has 1 correct answer and 3 wrong answers, the correct answer gives +1 point, whereas each wrong answer gives -1/3 points. You cannot drop below 0 points per question.
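
The arithmetic of this grading scheme can be sketched in a few lines of Python. This is only an illustration, not the grading tool actually used: the function name `score_question` is hypothetical, and the generalization to questions with other numbers of correct and wrong answers (each correct answer worth 1/#correct, each wrong answer -1/#wrong) is an assumption that merely reproduces the stated case of 1 correct and 3 wrong answers.

```python
from fractions import Fraction


def score_question(selected, correct, options):
    """Score one multiple-choice question (hypothetical helper).

    Assumption: each selected correct option is worth +1/#correct and each
    selected wrong option -1/#wrong. For 1 correct and 3 wrong options this
    gives +1 and -1/3, as in the announcement.
    """
    selected, correct, options = set(selected), set(correct), set(options)
    wrong = options - correct
    gain = Fraction(len(selected & correct), len(correct)) if correct else Fraction(0)
    loss = Fraction(len(selected & wrong), len(wrong)) if wrong else Fraction(0)
    # A question never contributes fewer than 0 points.
    return max(Fraction(0), gain - loss)


options = ["a", "b", "c", "d"]  # "a" is the single correct answer
print(score_question(["a"], ["a"], options))
print(score_question(["a", "b"], ["a"], options))
print(score_question(["b", "c"], ["a"], options))
```

Under these assumptions, ticking only the correct answer yields the full point, ticking the correct answer plus one wrong answer yields 1 - 1/3, and ticking only wrong answers is clamped to 0 rather than going negative.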

30.10.2017

Programming Group Registration till November 05

Please register your programming groups by November 05, following the instructions in the Project Information sheet.

25.10.2017

Discourse Forum

The Discourse Forum is now available. You can use it to form project teams, ask questions related to the course, etc.

23.10.2017

Project Information

We have uploaded a PDF document with information on the DBMS project in the materials section. It includes instructions on how to form project teams.


Patterns in Data Management Book

The Patterns in Data Management book is available on Amazon! Both the ebook and the paperback (with color graphics!) are now available: Patterns in Data Management: A Flipped Textbook (English Edition). The videos that were used in this class are summarized in this book. A previous version of the book was distributed to my students of this class. Thanks for your positive feedback!

Overview

We are flooded with data, be it data on the Web (HTML pages, Twitter, Facebook, map services, ...), structured data in databases (your money on bank accounts, addresses, cell phone data, school and uni grades, flight information, taxes, medical records, ...), or data in scientific applications (gene data in bioinformatics, telescope data in astronomy, collider data in physics, measurements of seismic activity in geology, ...).
 
The way we access, manage, and process that data has tremendous impact on:
 
  1. performance. Though we sometimes think that a performance problem is due to a particular algorithm requiring too much CPU time, it is often the data access patterns and retrieval times that slow down a program. The reason for bad performance may be that data cannot be accessed and shipped to the CPU fast enough. For instance, you may be using unsuitable access methods to retrieve a single piece of information from a large data repository. Or you might be using an inefficient data layout that ignores the memory hierarchy and hardware capabilities of modern processors. In addition, even if the data is retrieved efficiently, performance may suffer from picking the wrong analytical algorithms or from not scaling your system correctly.
     
  2. reliability. What happens if your hard disk fails or your data center is flooded with water? How do you make sure that a consistent version of your data is accessible at all times? Can you afford to lose all data? How do you exploit multi-threading for accessing data without corrupting your data repository?

If you are interested in these questions, this might be the right lecture for you. 

In this core lecture you will learn how to answer these questions. You will learn fundamental data managing algorithms and techniques that are used to build not only database systems but also search engines (Google), file systems, data warehouses, publish/subscribe systems (like Twitter), streaming systems, map services (Google Maps), Amazon's Cloud (EC2), etc.

These techniques and algorithms will allow you to design, plan, and build (almost) any kind of data managing system.

Administrative / Organisation

When: Thursday 10:15-12:00 and Tuesday 14:15-16:00 (E1 3, HS 002)
Administrative Details: https://infosys.cs.uni-saarland.de/teaching/ws17/dbms_admin_notes.pdf

Topics

  • data managing architectures (DBMS, file systems)
  • storage media and hierarchies (disk, flash, main memory, caches, NUMA)
  • storage management (DB-file systems, raw devices, write-strategies, differential files, buffer management, RAID, COW, MOW, virtual memory)
  • data layouts (horizontal and vertical partitioning, row stores, column stores, hybrid mappings, PAX, fractal design, compression, defragmentation)
  • indexing (one- and multidimensional, tree-structured, hash-, partition-based, B-trees, bulk-loading and external sorting, differential indexing, read- and write-optimized indexing, main-memory indexes, covering, composite, sparse and dense, direct and indirect, clustered and unclustered, main memory versus disk and/or flash-based, bitmaps)
  • query processing algorithms (join algorithms for relational and spatial data, grouping and early aggregation, co-grouping, filtering, external sorting and partitioning)
  • query optimization (query rewrite, cost models, cost-based optimization, join order, plan enumeration)
  • query processing models (operator, interpreted, vectorized, compiled, anti-projection, tuple reconstruction)
  • concurrency control (MVCC)
  • data recovery (single versus multiple instance, logging, main-memory vs disk, ARIES)
  • parallelization of data and queries (horizontal and vertical partitioning, shared-nothing, replication, NoSQL, Apache Spark, Apache Flink)

