Analytics and Dashboards with Elasticsearch and Kibana: setup and a link to a demo

Today we will have a look at how to set up an analytics dashboard with the Elastic suite, specifically with Elasticsearch, Kibana and Logstash.

The Elastic suite is a set of applications that provides everything needed to build a data analysis system fully integrated with your existing software, be it an e-commerce platform like Salesforce or a custom-made application built on Django or Rails.

One quick warning: Elastic is best suited to complex systems made up of many applications or services, and it may be overkill if you have a single application, because configuring it is more complicated than with other tools like Superset. If you have a number of services and want to collect data at a single point, or need to scale, Elastic becomes your best option and quickly pays back the effort of configuration and setup. Alternatively, you could use ElasticCloud, which is easy to set up even for less technical users; read more about this in the next section.

Elastic reports that companies like USGS, Warner Brothers, Salesforce and Netflix are using the suite for business analytics, which is not surprising, as these are all companies with huge amounts of data spread across various systems. For instance, Netflix reported in 2014 that it was running 755 Elasticsearch nodes!

Elasticsearch, Kibana and ElasticCloud

While setting up Elasticsearch on your own might be overkill, it is important to point out that there is a service, ElasticCloud, that does it for you. With ElasticCloud you get access to a full-fledged BI system with a database (Elasticsearch), ETL (Logstash) and dashboards (Kibana) for a very reasonable price: you can get started with $40-50 per month.

Indeed, Kibana alone might make ElasticCloud worth exploring: it is one of the most mature open source dashboarding offerings out there.

You can play with a demo of Kibana on our site here.

Considerations

This post is focused on getting started with Elasticsearch if you want to put together your own installation. There are, however, some further considerations in the final paragraph. If you have questions, please ask them in the comments; this will help us identify additional topics to cover in future articles.

Elasticsearch

Elasticsearch is an analytics engine and it’s the core of the suite. It’s powerful, scalable and comes with a rich HTTP API for querying and updating data.

Installing Elasticsearch

Elasticsearch requires Java 8 or higher, or the equivalent OpenJDK version. On Debian or Ubuntu, you can install it with:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.2.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.2.deb.sha512
shasum -a 512 -c elasticsearch-6.1.2.deb.sha512 
sudo dpkg -i elasticsearch-6.1.2.deb

Or on CentOS:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.2.rpm
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.1.2.rpm.sha512
shasum -a 512 -c elasticsearch-6.1.2.rpm.sha512 
sudo rpm --install elasticsearch-6.1.2.rpm

You might need to start the service by hand:

sudo -i service elasticsearch start

Check that Elasticsearch is running; you should receive an informative JSON response:

curl -X GET 'localhost:9200'

Ingestion

Let’s have a look at how we can ingest data into Elasticsearch through its RESTful API. We are going to use the Python requests library to interact with it:

pip install requests

For this example, I am also going to use a UN dataset on relative food prices, which you can download here.

Creating a mapping

Elasticsearch needs to know how our data is structured. To do this, we’ll create a mapping for food prices, declaring what fields we expect to have.

import requests

mappings = {
  'mappings': {
    'doc': {
      'properties': {
        'area': {'type': 'text'},
        'country': {'type': 'keyword'},
        'value': {'type': 'float'},
        'year': {'type': 'integer'}
      }
    }
  }
}

requests.put('http://localhost:9200/foodprices', json=mappings)
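
As a quick sanity check, we can ask Elasticsearch for the mapping back; the response should echo the fields we declared:

import requests

# Fetch the mapping of the foodprices index we just created
resp = requests.get('http://localhost:9200/foodprices/_mapping')
print(resp.json())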

Now we can send data to Elasticsearch:

import json

import requests

with open('foodprices.json', 'r') as f:
    data = json.load(f)

for row in data:
    requests.post('http://localhost:9200/foodprices/doc', json=row)
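
Posting one document per request is fine for a small dataset, but it gets slow at scale. Here is a minimal sketch using Elasticsearch’s _bulk endpoint instead (same foodprices.json as above, ES 6.x syntax):

import json

import requests

with open('foodprices.json', 'r') as f:
    data = json.load(f)

# The bulk API expects newline-delimited JSON: an action line,
# then the document itself, with a trailing newline at the end
lines = []
for row in data:
    lines.append(json.dumps({'index': {'_index': 'foodprices', '_type': 'doc'}}))
    lines.append(json.dumps(row))
body = '\n'.join(lines) + '\n'

resp = requests.post('http://localhost:9200/_bulk',
                     data=body.encode('utf-8'),
                     headers={'Content-Type': 'application/x-ndjson'})
print(resp.json()['errors'])  # False means every document was indexed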

Retrieving data

The Elasticsearch API can also be used to retrieve data. For example, if we wanted to retrieve food prices for France, we could do this:

import requests

query = {
  "query": {
    "match": {
     "country": "France" 
    }
  }
}

resp = requests.get('http://localhost:9200/foodprices/_search?pretty=true', json=query)
print(resp.text)
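
Since Kibana’s charts are built on Elasticsearch aggregations, it is worth seeing one raw. Here is a small sketch aggregating the average food price per year for France (ES 6.x aggregations DSL):

import requests

query = {
  'size': 0,  # skip the matching documents, return only the buckets
  'query': {'match': {'country': 'France'}},
  'aggs': {
    'by_year': {
      'histogram': {'field': 'year', 'interval': 1},
      'aggs': {'avg_price': {'avg': {'field': 'value'}}}
    }
  }
}

resp = requests.get('http://localhost:9200/foodprices/_search', json=query)
for bucket in resp.json()['aggregations']['by_year']['buckets']:
    print(int(bucket['key']), bucket['avg_price']['value'])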

Logstash

Logstash is the ingestion (ETL) application of the suite. It’s a configurable system that can transform and filter data before sending it to Elasticsearch. Logstash’s main purpose is to collect data from logs, but it works well with HTTP requests too.

Most of the time, if you are relying on a major framework, you should be able to find a plugin that takes care of sending data to Elasticsearch. If you have a custom application with unusual data, or require particular filtering options, you might need to use Logstash.

Installing

The installation process is similar to Elasticsearch’s. On Debian-based distros:

wget https://artifacts.elastic.co/downloads/logstash/logstash-6.1.2.deb
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.1.2.deb.sha512
shasum -a 512 -c logstash-6.1.2.deb.sha512 
sudo dpkg -i logstash-6.1.2.deb

Or on CentOS:

wget https://artifacts.elastic.co/downloads/logstash/logstash-6.1.2.rpm
wget https://artifacts.elastic.co/downloads/logstash/logstash-6.1.2.rpm.sha512
shasum -a 512 -c logstash-6.1.2.rpm.sha512 
sudo rpm --install logstash-6.1.2.rpm

Configuring

Logstash requires a configuration file: it needs to know how to parse the data it receives and where to send it. Depending on your system, you should have a Logstash configuration folder. On Debian 9 it’s /etc/logstash, with the logstash binary at /usr/share/logstash/bin/logstash.

For example, to ingest HTTP requests:

input {
    http {
        host => "127.0.0.1"
        port => 8080
    }
}
output {
    elasticsearch { hosts => ["localhost:9200"] }
    stdout { codec => rubydebug }
}

Now we can run Logstash:

/usr/share/logstash/bin/logstash -f logstash.conf

We can check if it’s working by sending a simple request:

import requests

requests.put('http://127.0.0.1:8080/hello', json={'message': 'send me to elastic'})

Then Logstash will send this representation of the request to Elasticsearch:

{
       "message" => "send me to elastic",
      "@version" => "1",
    "@timestamp" => 2018-01-01T10:00:00.000Z,
          "host" => "127.0.0.1",
       "headers" => {
             "http_user_agent" => "python-requests/2.18.4",
                "http_version" => "HTTP/1.1",
                "content_type" => "application/json",
                 "http_accept" => "*/*",
                "request_path" => "/hello",
              "request_method" => "PUT",
             "http_connection" => "keep-alive",
                   "http_host" => "127.0.0.1:8080",
        "http_accept_encoding" => "gzip, deflate",
              "content_length" => "33",
                 "request_uri" => "/hello"
    }
}

In a production setting you would tailor Logstash’s filters and transformations so that Elasticsearch receives only the data you want, in the correct format.

Kibana

Kibana is a dashboarding application made specifically for visualizing Elasticsearch data. Once linked to Elasticsearch, it will automatically build real-time graphs and comparisons using Elasticsearch’s data. Kibana supports plugins, for example to create custom graph types.

Installing

The installation is similar to Elasticsearch and Logstash:

wget https://artifacts.elastic.co/downloads/kibana/kibana-6.1.2-amd64.deb
sha1sum kibana-6.1.2-amd64.deb
sudo dpkg -i kibana-6.1.2-amd64.deb

Or on CentOS:

wget https://artifacts.elastic.co/downloads/kibana/kibana-6.1.2-x86_64.rpm
sha1sum kibana-6.1.2-x86_64.rpm
sudo rpm --install kibana-6.1.2-x86_64.rpm

Let’s start the service and make sure it’s running:

sudo -i service kibana start

Kibana listens on port 5601, so once started it will be available at http://localhost:5601.

Creating a graph

Kibana also needs a bit of configuration, but it’s all done from the dashboard itself.

Index patterns

First, Kibana needs to know which Elasticsearch indices to read from. Let’s do it step by step:
1. Go to the management page and click on “index patterns”
2. On the index patterns page, click on “create index pattern”
3. In the index pattern field, write “foodprices”. This is the name we used when setting up the Elasticsearch mapping. Click the “next step” button
4. On the next page, we just need to confirm the creation. Click on the “create index pattern” button
5. Done! You should be back at the management page

Saved searches
Now we are going to create a search and save it.

  1. Go to the Discover page, and from the dropdown just right of the menu, select “foodprices”
  2. Select the fields that we need: _id, area, country, value and year
  3. Save the search with the name “food prices”

Creating the graph

  1. Go to the Visualize page and click on “create a visualization”
  2. Choose the line chart from the charts list
  3. Choose “foodprices*” from the search list on the left panel
  4. Click on “add filter”, pick “country”, “is one of”, and France and United States as countries
  5. In Metrics, pick “average” and “value” for the y-axis. Label this axis “Food price”
  6. In Buckets, choose “histogram”, “year” and 1 as the interval
  7. Click on “add sub-buckets”, choose “split series” and “filters” as the aggregation. Add two filters: country: “France” and country: “United States”
  8. Click the play button at the top of the widget to update the graph
  9. Save the visualization as “France vs USA”
  10. Enjoy the insights!

Final considerations

A Kibana demo is available here.

There are two ways to evaluate Elasticsearch and Kibana. If you are a solution architect or an IT professional, you will appreciate the “batteries included” solution on offer here. With the Elastic stack you get coverage of the complete end-to-end process, from the database to ETL to dashboards for your users. It is also worth noting that Kibana is a business-friendly application, so once the data is there, opening things up to business users will be easy.

If you are an analyst looking for a dashboard solution, you can consider ElasticCloud, which is very simple to set up and requires little administration. You can then use Python to analyse, transform and finally load your data into Elasticsearch for advanced dashboarding with Kibana. While you can code static charts in Python, the “point and click” nature of Kibana will speed up your analytics pipeline and help you produce quality visualizations. With Kibana you can focus on your data analysis in Python, because building reports and visualizations will be quick and easy (minutes versus hours with Python).
