Data Warehouse

Data Warehouse in the age of Big Data

A Data Warehouse is a software architecture that has for many years helped companies extract valuable knowledge from their various IT systems. The conditions under which this technology is deployed, however, have changed dramatically in recent years: many companies now produce far more data than before, and the time available to analyse it has shrunk drastically. At the same time, the thirst for knowledge on the part of companies and organisations has been growing. As a result, classic Data Warehouse approaches are quickly pushed to their limits. Big Data technologies promise to meet these new requirements and offer ways to enhance and modernise the traditional Data Warehouse concept.

The classic Data Warehouse

While a company's operational systems are focused on supporting the activities necessary for daily business, a Data Warehouse focuses on analysing and reporting on how the company is run. The technological base of a Data Warehouse system is the relational database management system (RDBMS). For the most common Data Warehouse use cases, relational databases and data marts are therefore a good choice.

The limits of traditional Data Warehouses

When faced with extremely high data volumes, scaling a Data Warehouse can be very difficult. For companies using commercial database software, it may also result in high licensing costs. This deters many companies, which then fail to analyse their data and make use of the knowledge it contains.

Because more and more data in non-standard formats is becoming the focus of analysis, relational databases can quickly be pushed to their limits.

To satisfy these new demands, new technologies are being brought into play.

Enhancing the Data Warehouse with Hadoop, NoSQL & more

A number of technological approaches have been developed to overcome the limitations of the RDBMS: NoSQL databases, Apache Hadoop, and analytical databases.

Hadoop as a Data Warehouse platform

Apache Hadoop picks up where traditional Data Warehouse systems reach their limits. The essential problem with conventional Data Warehousing technologies is the rapid rise in operational costs when processing large amounts of data. In addition, more and more unstructured data is being produced that simply does not fit into the logic of a standard Data Warehouse. Hadoop is not a database. Instead, it is built on the distributed file system HDFS and the MapReduce framework for processing data.
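To give a feel for the MapReduce processing model mentioned above, here is a minimal single-process sketch in plain Python. It does not use Hadoop or HDFS; the sample records and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are illustrative assumptions, and a real cluster would distribute each phase across many machines.

```python
from collections import defaultdict

# Hypothetical input records: (region, sales amount).
RECORDS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
    ("east", 50.0),
    ("south", 20.0),
]


def map_phase(record):
    """Emit one (key, value) pair for a single input record."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Group all values by key, as happens between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values that share a key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory data set."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(RECORDS))
```

Because the map and reduce functions each look at only one record or one key at a time, the work can in principle be split across many nodes, which is what makes the model attractive for very large data volumes.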

Analytical databases

Analytical databases are a relatively simple extension for a Data Warehouse system that can be implemented quickly. Typical examples include InfiniDB, Infobright, Vertica, and Vectorwise. There are also databases that are built upon relational database management systems (RDBMS), but have been optimised for fast queries.

Analytical databases use different technologies to speed up data processing.

These include:

  • Column-oriented storage
  • Massively parallel processing (MPP)
  • Data compression
  • In-memory storage
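Two of the techniques listed above, storing data by column and compressing it, can be illustrated with a small Python sketch. The sample rows and the helper names (`to_columns`, `rle_encode`, `rle_decode`) are assumptions made for this example; run-length encoding is only one of several possible compression schemes and is not necessarily the one used by any of the products named here.

```python
# Hypothetical rows of a small fact table: (order_id, country, amount).
ROWS = [
    (1, "DE", 10),
    (2, "DE", 20),
    (3, "DE", 5),
    (4, "FR", 7),
    (5, "FR", 3),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a list as [value, run_length] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Expand [value, run_length] pairs back into the original list."""
    values = []
    for value, count in runs:
        values.extend([value] * count)
    return values


columns = to_columns(ROWS, ["order_id", "country", "amount"])

# Repeated values sit next to each other in a column, so they compress well.
country_runs = rle_encode(columns["country"])

# An aggregate only needs to read the one column it refers to.
total_amount = sum(columns["amount"])
```

The aggregate touches only the `amount` column, and the `country` column shrinks to two runs, which hints at why analytical queries over a few columns of a wide table benefit from this layout.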

Data Warehouse: Your benefits

  • Comprehensive view of enterprise data
  • Analysis of integrated data sources
  • Time savings through standardised access to data
  • Automated and standardised determination of key metrics
  • Knowledge gains by linking information (data mining)
  • Historisation of corporate data for time series comparisons

Our services: Design and modelling of a Data Warehouse

As an experienced consulting company in the BI sector, we use powerful open source tools such as Pentaho Data Integration (PDI), Jedox ETL and Talend to develop the extraction, transformation and loading (ETL) processes you need. At the database level, we use open source technologies such as MySQL, PostgreSQL and Infobright, which enable us to implement complex requirements for an analytical database.
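As a rough illustration of what an ETL process does, the following Python sketch extracts rows from a CSV export, cleans them, and loads them into a database table. It is not how PDI, Jedox ETL or Talend work internally; the sample data, the `orders` table and the function names are hypothetical, and SQLite from the Python standard library stands in for MySQL or PostgreSQL so that the example runs without a server.

```python
import csv
import io
import sqlite3

# Hypothetical export from an operational system.
SOURCE_CSV = """order_id,customer,amount,currency
1, alice ,10.50,EUR
2,BOB,,EUR
3,carol,7.25,EUR
"""


def extract(text):
    """Extract: read the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount and normalise the rest."""
    clean = []
    for row in rows:
        if not row["amount"].strip():
            continue  # assumption: rows with a missing amount are skipped
        clean.append(
            (
                int(row["order_id"]),
                row["customer"].strip().lower(),
                float(row["amount"]),
            )
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    """Run the three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn


if __name__ == "__main__":
    connection = run_etl(SOURCE_CSV)
    print(connection.execute("SELECT * FROM orders").fetchall())
```

Keeping the three steps as separate functions mirrors how ETL tools typically model a job as a chain of individual steps, each of which can be tested and replaced on its own.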