Customer Analytics

Virtuoso handling of customer data

Collecting customer data is getting easier and easier for companies. The current challenge, however, is the useful and profitable integration of this data into existing business processes.

Merging conventional data sources with Big Data sources in a fully integrated environment such as Pentaho gives companies an on-demand 360° view of their customers.

Today, 80% of all customer data comes from sources such as social media, blogs, forums, tweets, web shops and internet sites. Combining, enriching and permanently associating such structured and unstructured data from the Big Data pool with customers provides new insights into customer behaviour.

Having a 360° view means constantly keeping all data from all sources up to date, at a predefined level of quality, and available for use. The following information categories are of particular interest in customer analytics:

  • Customer characteristics (preferences, needs, desires)
  • Customer interactions (offers, click streams, notes)
  • Customer activity data (orders, payments, length of stay)
  • Data that describes the customer (special features, self-evaluations, demographics)

Companies that collect data aim to know "everything about each customer". The benefits of such a 360° view can only be reaped, however, if this knowledge is converted into targeted measures and actions for every customer interaction. Customer analytics makes this possible.

Practical tips for implementing a customer analytics project

Before any data can be analysed, it must first be extracted from the various source systems. When consolidating data, remember that it often exists in heterogeneous structures: structured data from operational systems such as a CRM system has to be combined with unstructured data from social media platforms, which makes data processing and cleansing more complex.

Customer data must also be kept current so that decisions rest on the latest information; it may even be necessary to process data in real time. In addition to these demands on data variety and processing speed, high data volumes pose a particular challenge for databases.

To deal successfully with such data volumes, an efficient data storage system that scales as volumes grow must be implemented. Predictive analytics models that recognise patterns in the stored data are then needed to derive forecasts of future developments.

Finally, the results of any analysis must be presented in a suitable visual form. Dashboards and reports serve this purpose, giving users information about customer behaviour at a glance.

Specific recommendations on technology for customer analytics use cases

The it-novum solution for realising customer analytics is a combination of the Pentaho BI Suite and services from the Hadoop ecosystem, with Pentaho Data Integration (PDI) acting as the central control unit and interface to Hadoop. This simplifies interaction with Hadoop services such as HDFS or Spark and keeps the programming effort to a minimum.

Extracting data from source systems

As mentioned above, data first has to be extracted from the source systems before it can be analysed. Sqoop helps here by loading structured data from relational tables into the Hadoop cluster. For streaming applications, Flume, Kafka or Spark Streaming are available for ingesting data in real time.

Consolidating data

After the data has been extracted successfully, it has to be consolidated. For the pre-processing and clean-up work, Hadoop uses frameworks such as Spark and MapReduce, which make it possible to realise distributed and scalable calculations across the entire cluster.
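
The consolidation step can be illustrated with a minimal sketch in plain Python. It is not a Spark or MapReduce job; it only shows the kind of logic such a job would distribute across the cluster. The record layout, the handles and the `consolidate` function are hypothetical and chosen purely for illustration.

```python
import re
from collections import defaultdict

# Hypothetical structured records, as they might arrive from a CRM export.
crm_rows = [
    {"customer_id": 1, "name": "Ada Example", "handle": "@ada_ex"},
    {"customer_id": 2, "name": "Bo Sample", "handle": "@bo_s"},
]

# Hypothetical unstructured posts, as they might arrive from a social media feed.
posts = [
    "@ada_ex: love the new web shop, ordering again!",
    "@unknown_user: delivery took a while",
    "@ADA_EX: the checkout page was slow today",
]

# Assumed post layout: "<handle>: <free text>".
HANDLE = re.compile(r"^\s*(@\w+)\s*:\s*(.*)$")


def consolidate(crm_rows, posts):
    """Attach each post to the CRM customer whose handle wrote it."""
    by_handle = {row["handle"].lower(): row["customer_id"] for row in crm_rows}
    merged = defaultdict(list)
    unmatched = []
    for post in posts:
        match = HANDLE.match(post)
        if not match:
            unmatched.append(post)
            continue
        handle, text = match.group(1).lower(), match.group(2).strip()
        customer_id = by_handle.get(handle)
        if customer_id is None:
            unmatched.append(post)  # author is not a known customer
        else:
            merged[customer_id].append(text)
    return dict(merged), unmatched


merged, unmatched = consolidate(crm_rows, posts)
```

Posts whose author cannot be matched are kept in a separate list rather than dropped, so that they remain available for later cleansing or identity resolution.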

Storing data

When it comes to data storage, Hadoop provides services such as HDFS, HBase and Kudu for storing data across multiple cluster nodes. File formats such as Parquet or Avro are also available; they support efficient compression and save storage space.

Analysing data

The Spark and MapReduce frameworks can also be used to perform analyses on the files stored in the Hadoop cluster. Hive and Impala additionally allow access via SQL queries, so analyses can be performed without writing any program code beyond the query itself.
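
To give an impression of SQL-based analysis, the following sketch runs an aggregate query of the kind one might issue against Hive or Impala. Because neither engine is available in a standalone snippet, Python's built-in `sqlite3` module serves as a local stand-in; the `orders` table, its columns and its values are hypothetical, and the SQL dialects of Hive and Impala differ from SQLite in places.

```python
import sqlite3

# In-memory database as a local stand-in for a table queried via Hive or Impala.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2023-01-15", 120.0),
        (1, "2023-02-03", 80.0),
        (2, "2023-01-20", 40.0),
        (2, "2023-03-11", 60.0),
        (3, "2023-03-29", 300.0),
    ],
)

# Turnover and number of orders per customer, highest turnover first.
query = """
    SELECT customer_id, COUNT(*) AS order_count, SUM(amount) AS turnover
    FROM orders
    GROUP BY customer_id
    ORDER BY turnover DESC
"""
top_customers = conn.execute(query).fetchall()
```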

Predictive analyses

If a predictive analytics application is required in conjunction with Hadoop, the Cloudera Data Science Workbench is recommended. It provides users with intuitive access to cluster resources as well as a development environment for data science frameworks such as Spark ML and TensorFlow.

Visualising data

The components of the Pentaho BI Suite can be used to visualise the data under review. With the suite, individual reports and interactive dashboards can be created that support the user in evaluating customer data.

Benefits and applications

Three time-centred customer views

The Past: Evaluation and historicisation of past customer interactions

  • What did a particular customer buy and when?
  • Recognition of seasonal fluctuations, e.g. in turnover
  • On-demand visualisation at a glance
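
One simple way to make seasonal fluctuations in turnover visible is a seasonal index: the average turnover of a calendar month divided by the average over all observed months. The sketch below assumes a hypothetical order history and only considers months in which orders occurred; it is an illustration, not a full time-series method.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order history: (order date, turnover).
orders = [
    (date(2022, 6, 10), 100.0),
    (date(2022, 12, 5), 300.0),
    (date(2023, 6, 20), 100.0),
    (date(2023, 12, 12), 300.0),
]


def seasonal_index(orders):
    """Average turnover per calendar month relative to the overall monthly average."""
    per_period = defaultdict(float)  # (year, month) -> turnover
    for day, amount in orders:
        per_period[(day.year, day.month)] += amount

    per_month = defaultdict(list)  # month -> one total per observed year
    for (_, month), total in per_period.items():
        per_month[month].append(total)

    overall = sum(per_period.values()) / len(per_period)
    return {m: (sum(v) / len(v)) / overall for m, v in sorted(per_month.items())}


index = seasonal_index(orders)
```

A value above 1 marks a month with above-average turnover, a value below 1 a weaker month.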

The Present: Prompt reactions to current customer interactions

  • Collaborative filtering / recommender systems
      o Customers (with similar buying behaviours) also ordered...
      o Product X also goes well / works well with product Y...
      o Increase in cross-selling and upselling potential
  • Personalised offers tailored to the customer's needs
  • Improving customer service
  • Strengthening customer loyalty
  • Prioritisation of high-value customers
  • Optimising customer acquisition
  • Improving brand perception
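
The "customers also ordered" idea can be sketched with a plain co-occurrence count, one of the simplest stand-ins for item-based collaborative filtering. The baskets and the function names `build_cooccurrence` and `also_ordered` are hypothetical; a production recommender would typically add normalisation and run on a framework such as Spark.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical baskets: customer -> set of products ordered.
baskets = {
    "c1": {"tent", "sleeping bag", "stove"},
    "c2": {"tent", "sleeping bag"},
    "c3": {"tent", "stove"},
    "c4": {"sleeping bag", "lantern"},
}


def build_cooccurrence(baskets):
    """Count how often each ordered pair of products appears in the same basket."""
    counts = defaultdict(Counter)
    for products in baskets.values():
        for a, b in permutations(sorted(products), 2):
            counts[a][b] += 1
    return counts


def also_ordered(counts, product, top_n=2):
    """Products most often ordered together with `product`, most frequent first."""
    # Ties are broken alphabetically so that the result is deterministic.
    ranked = sorted(counts[product].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


counts = build_cooccurrence(baskets)
suggestions = also_ordered(counts, "tent")
```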

The Future: Deriving future customer behaviour (predictive analytics)

  • Algorithmic pattern recognition based on historical data
  • Customer segmentation by similarity or other criteria (clustering)
  • Ranking customers by probability of churn (classification)
      o Proactive measures for retaining "at risk" customers
  • Forecasts of future customer sales developments (regression)
      o What is the expected next-quarter turnover for customer X?
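
The regression question above can be illustrated with ordinary least squares on a single customer's quarterly turnover. The figures are hypothetical, and a straight-line trend is only an assumption made for this sketch; real forecasts would usually account for seasonality and use a library such as Spark ML.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical turnover of customer X over the last four quarters.
quarterly_turnover = [1000.0, 1100.0, 1200.0, 1300.0]
quarters = list(range(1, len(quarterly_turnover) + 1))

slope, intercept = fit_line(quarters, quarterly_turnover)

# Extrapolate the fitted trend to the next quarter.
next_quarter = len(quarterly_turnover) + 1
forecast = slope * next_quarter + intercept
```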

The 360° view of the customer needs to be re-orchestrated

Given the broad range of sources and the large amount of data, a new approach to the technology-based "360° view of the customer" must be found.

A crucial factor is a professional master data management system for customer data. Metadata is particularly important here, because only with its help can customers be uniquely identified across the various contact channels. After all, companies want to know whether a site visitor, influencer, blogger or (re)tweeter is already known to them as a customer and, ideally, how profitable that customer is. Finding this out requires intelligent algorithms for customer identity resolution, so that all available data, structured and unstructured, can be associated with the customer and analysed. The result is the so-called "golden record": a dataset that holds all information about a customer and provides a 360° view at the customer data level.

With customer master data management combined with customer analytics, a customer's digital footprint becomes readable and available for analysis.
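
A very reduced sketch of identity resolution and the resulting golden record is shown below. It assumes that records from different channels can be matched on a normalised e-mail address, which is a deliberate simplification; real identity resolution usually combines several attributes and fuzzy matching. All records, field names and functions are hypothetical.

```python
def normalise_email(email):
    """Reduce an e-mail address to a comparable key."""
    return email.strip().lower()


# Hypothetical records for the same people, collected through different channels.
records = [
    {"source": "crm", "email": "Ada@Example.com ", "name": "Ada Example", "city": None},
    {"source": "webshop", "email": "ada@example.com", "name": None, "city": "Berlin"},
    {"source": "newsletter", "email": "bo@example.com", "name": "Bo Sample", "city": None},
]


def build_golden_records(records):
    """Merge all records that share a normalised e-mail into one golden record."""
    golden = {}
    for record in records:
        key = normalise_email(record["email"])
        merged = golden.setdefault(key, {"email": key, "sources": []})
        merged["sources"].append(record["source"])
        for field in ("name", "city"):
            # Keep the first non-empty value seen for each attribute.
            if record.get(field) and not merged.get(field):
                merged[field] = record[field]
    return golden


golden = build_golden_records(records)
```

Keeping the list of contributing sources in each golden record makes it possible to trace every attribute back to the channel it came from.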

Expert recommendation

Before data from the various sources is processed, data quality management tools should ensure that it is clean and current. Quality mechanisms that enforce the consistency of customer data across all data sources should apply from the start, when information is first entered into the system.
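
What such a quality mechanism at the point of entry might check can be sketched as follows. The rules, the field names and the 365-day freshness threshold are assumptions made for illustration, not the behaviour of any particular data quality tool.

```python
import re
from datetime import date

# Deliberately loose pattern: something@something.tld without spaces.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

MAX_AGE_DAYS = 365  # hypothetical freshness threshold


def quality_issues(record, today):
    """Return a list of quality problems found in one customer record."""
    issues = []
    if not record.get("customer_id"):
        issues.append("missing customer_id")
    if not EMAIL.match(record.get("email") or ""):
        issues.append("invalid email")
    updated = record.get("last_updated")
    if updated is None or (today - updated).days > MAX_AGE_DAYS:
        issues.append("stale record")
    return issues


today = date(2024, 1, 1)
clean = {"customer_id": 7, "email": "ada@example.com", "last_updated": date(2023, 6, 1)}
dirty = {"customer_id": None, "email": "not-an-email", "last_updated": date(2020, 1, 1)}

clean_issues = quality_issues(clean, today)
dirty_issues = quality_issues(dirty, today)
```

Records with a non-empty issue list could be rejected or routed to a cleansing step before they reach the shared customer data pool.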

The advantages of a 360° view of your customers

  • Improved customer service and increased sales
  • Lower churn rate
  • Reduction in customer acquisition costs
  • Increase in cross-selling and upselling potential
  • Visible representation of how customers perceive your corporate brand
  • Prioritisation of high-value customers

Our service: Providing added value with a Pentaho customer analytics solution

  • A single data pool containing all customer data makes fast queries possible
  • Business users find all key metrics in one central location
  • Joining up previously isolated data, avoiding selective integration
  • Combining traditional data sources with Big Data
  • Creating extensive analyses (visualisations, reports, dashboards, ad hoc analyses)
  • Embedded analytics, making operational information directly usable
  • Development of predictive analytics models for pattern recognition in the data