The exponential growth and increasing complexity of collected information call for new tools and techniques to study data and make informed decisions based on the results. Not surprisingly, the last several years have seen rapid development of platforms and methods to collect/extract (e.g., Elastic Beats, rsyslog), transform (e.g., Elastic Logstash, Apache NiFi), load (e.g., Apache Flume, syslog-ng), store (e.g., HDFS, Elasticsearch), process (e.g., Apache Flink, Apache Ignite), analyze (e.g., Apache Spark, Apache Hive), and visualize (e.g., Elastic Kibana, Grafana) big data. Unfortunately, the majority of these tools still require development skills and are therefore not accessible to the general public. Fortunately, some companies, Elastic among them, have recognized the need for tools that non-developers can use to analyze data and have started to provide them within their solutions, e.g., Kibana in the case of Elastic. Kibana can be easily configured by users with little or no programming knowledge to analyze and visualize data stored in Elasticsearch. However, Kibana's functionality for visualizing multidimensional dependencies in data is still very limited.
In this thesis, we take a step toward addressing this issue. We present a tool for multidimensional data exploration called Insight, which we have developed as a plugin for Kibana. Insight lets users explore the connections between multiple attributes of the data. Insight relies on two main data analysis principles: 1) the Pareto rule (20% of the effort gives 80% of the result), according to which the analysis of heavy hitters provides very valuable insights about the data; 2) the interconnection between attribute values allows analysts to draw conclusions about algorithms for data clustering. In this thesis, we explain in detail how Insight is implemented, what components it consists of, and how to use it to explore data containing multiple attributes. Insight can be used by analysts as a standalone graph for initial data exploration, as well as part of a Kibana dashboard, because it is tightly integrated with all Kibana components, namely filters, the query string, and the time picker. We show the value of Insight on several use cases and compare it with other available visualizations for analyzing multidimensional data.
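The two principles named above can be illustrated with a minimal sketch. This is not Insight's implementation, which the abstract does not describe at this level; the function names `heavy_hitters` and `co_occurrences`, the `coverage` parameter, and the sample record fields are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def heavy_hitters(records, attribute, coverage=0.8):
    """Return the most frequent values of `attribute`, in descending order,
    until they jointly account for at least `coverage` of the records.

    Hypothetical helper: illustrates the Pareto-style idea that a few
    values often explain most of the data.
    """
    counts = Counter(r[attribute] for r in records if attribute in r)
    total = sum(counts.values())
    selected, covered = [], 0
    for value, count in counts.most_common():
        # Stop once the chosen values already cover the requested share.
        if total == 0 or covered / total >= coverage:
            break
        selected.append((value, count))
        covered += count
    return selected


def co_occurrences(records, attr_a, attr_b):
    """Count how often each pair of values of two attributes appears together.

    Hypothetical helper: a simple way to look at the interconnection
    between attribute values.
    """
    return Counter(
        (r[attr_a], r[attr_b])
        for r in records
        if attr_a in r and attr_b in r
    )


# Made-up sample records, for illustration only.
sample = [
    {"src_ip": "10.0.0.1", "port": 443},
    {"src_ip": "10.0.0.1", "port": 443},
    {"src_ip": "10.0.0.1", "port": 80},
    {"src_ip": "10.0.0.1", "port": 443},
    {"src_ip": "10.0.0.2", "port": 22},
]

top_sources = heavy_hitters(sample, "src_ip", coverage=0.5)
pairs = co_occurrences(sample, "src_ip", "port")
```

In this sketch, `top_sources` holds the dominant source address and `pairs` maps each (address, port) combination to its frequency, which is the kind of relationship a multidimensional visualization would surface.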
| Date of Award | 2018 |
|---|---|
| Original language | American English |
| Awarding Institution | HBKU College of Science and Engineering |
- Cybersecurity
- Data Science
- Elasticsearch
- Kibana
- Visualization
Insight: A Kibana Visualization Tool for Multidimensional Data Exploration
Shoeb, T. (Author). 2018
Student thesis: Master's Dissertation