Some readers have asked me whether I can recommend any good BI (Business Intelligence) tools. Put simply, a BI tool is a data visualization tool. Today I recommend DataEase, an open-source data visualization tool built on SpringBoot and integrated with Apache Doris and Kettle. It supports queries over large volumes of data that return within seconds. I hope you find it helpful!
Brief introduction
DataEase is an open-source data visualization and analysis tool that anyone can use, and it already has 4.1K+ stars on GitHub. It aims to help users analyze data quickly and gain insight into business trends, so that they can improve and optimize their business. DataEase supports a rich set of data source connections, lets you build charts quickly by dragging and dropping, and makes it easy to share them with others.
The following is a large-screen visualization generated with DataEase, and it looks very cool.
Framework
As a data visualization tool, DataEase uses the popular big data technologies Apache Doris and Kettle. If you want to learn these two technologies, this project is a good choice.
System architecture
The technology stack used by DataEase is as follows:
Technology | Description |
---|---|
SpringBoot | Back-end base framework |
MySQL | Data storage |
Apache Doris | A modern MPP analytical database. Query results are returned with sub-second response times, which effectively supports real-time data analysis. |
Kettle | An open-source ETL (extract, transform, load) tool written in pure Java that extracts data efficiently and stably. |
Docker | Container deployment |
Vue | Front-end base framework |
Element | Front-end UI framework |
The usage scenarios of various technologies in DataEase are as follows:
Functional architecture
The following is the functional architecture diagram of DataEase, from which we can easily see what we can do with DataEase.
Installation
DataEase provides an installation package. Download it and run the installation script install.sh to complete the installation. If MySQL is already installed on your server, some additional configuration is needed.
- First, download the installation package. Version v1.5.2 is used here; download address: github.com/dataease/da...
- After downloading, upload it to the Linux server and extract it in the target directory with the following command;

      tar -zxvf dataease-v1.5.2-online.tar.gz

- After extraction, the directory structure is as follows. Note that the Docker Compose deployment files are under the dataease folder;
- Next, modify the installation configuration file install.conf, mainly the service port DE_PORT and the MySQL settings;

      # Basic configuration
      ## Installation directory
      DE_BASE=/opt
      ## Service port (default 80, which is likely to conflict)
      DE_PORT=8010
      # Database configuration
      ## Use external database
      DE_EXTERNAL_MYSQL=false
      ## Database address (the default is mysql; if you have installed MySQL with Docker before, it is recommended to change it)
      DE_MYSQL_HOST=mysql-de
      ## Database port (3306 by default; if you have installed MySQL with Docker before, it is recommended to change it)
      DE_MYSQL_PORT=3307
      ## DataEase database name
      DE_MYSQL_DB=dataease
      ## Database user name
      DE_MYSQL_USER=root
      ## Database password
      DE_MYSQL_PASSWORD=Password123@mysql

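Before running the installer, it can be worth double-checking which values actually ended up in install.conf. The following is only a small sketch: `conf_get` is a hypothetical helper (it is not part of DataEase), and it assumes the plain KEY=value layout shown above.

```shell
# Hypothetical helper (not part of DataEase): print the value of KEY from a
# KEY=value style file such as install.conf. Comment lines never match
# because they do not start with "KEY=".
conf_get() {
  local key="$1" file="$2"
  # Use the last matching assignment and print everything after the first "=".
  grep -E "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}

# Example usage, assuming install.conf is in the current directory:
#   conf_get DE_PORT install.conf
#   conf_get DE_MYSQL_HOST install.conf
```

If a key is missing, the helper simply prints nothing, so an empty result is a hint that the setting was not written the way you expected.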
- Modify DataEase's docker-compose file, dataease/docker-compose.yml, changing the MySQL dependency name and the network configuration, because the default network configuration may cause conflicts;

      services:
        dataease:
          image: registry.cn-qingdao.aliyuncs.com/dataease/dataease:v1.5.2
          container_name: dataease
          ports:
            - ${DE_PORT}:8081
          mem_limit: 4096m
          volumes:
            - ${DE_BASE}/dataease/conf:/opt/dataease/conf
            - ${DE_BASE}/dataease/logs:/opt/dataease/logs
            - ${DE_BASE}/dataease/plugins/thirdpart:/opt/dataease/plugins/thirdpart
            - ${DE_BASE}/dataease/data/kettle:/opt/dataease/data/kettle
          depends_on:
            # If you have previously installed MySQL using Docker, modify the name
            mysql-de:
              condition: service_healthy
          networks:
            - dataease-network
      networks:
        dataease-network:
          driver: bridge
          ipam:
            driver: default
            # The default network segment configuration may conflict. It is recommended to modify it
            config:
              - subnet: 172.33.0.0/16
                gateway: 172.33.0.1

- Modify Doris's docker-compose file, dataease/docker-compose-kettle-doris.yml, mainly the network configuration;

      version: '2.1'
      services:
        doris-fe:
          image: registry.cn-qingdao.aliyuncs.com/dataease/doris:0.15
          container_name: doris-fe
          networks:
            # Changed to the 172.33 network segment to prevent conflicts
            dataease-network:
              ipv4_address: 172.33.0.198
          restart: always
        doris-be:
          image: registry.cn-qingdao.aliyuncs.com/dataease/doris:0.15
          networks:
            # Changed to the 172.33 network segment to prevent conflicts
            dataease-network:
              ipv4_address: 172.33.0.199
          restart: always

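The static addresses given to doris-fe and doris-be have to fall inside the subnet declared for dataease-network, otherwise the containers cannot join the network. As a rough sanity check, the sketch below compares an address with a /16 subnet. `in_subnet16` is a hypothetical helper written for this article, and it deliberately handles only /16 prefixes.

```shell
# Hypothetical helper: succeed if IPv4 address $1 lies in the /16 network $2
# (for example 172.33.0.0/16). Other prefix lengths return status 2.
in_subnet16() {
  local ip="$1" subnet="$2"
  local net="${subnet%/*}" prefix="${subnet#*/}"
  [ "$prefix" = "16" ] || return 2
  # For a /16 network, the first two octets must be identical.
  [ "${ip%.*.*}" = "${net%.*.*}" ]
}

# Check the two Doris addresses used in this article.
for ip in 172.33.0.198 172.33.0.199; do
  if in_subnet16 "$ip" "172.33.0.0/16"; then
    echo "$ip is inside 172.33.0.0/16"
  else
    echo "$ip is outside 172.33.0.0/16"
  fi
done
```

If you pick a different network segment, change the subnet, the gateway and both ipv4_address values together so that this check still passes.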
- Modify MySQL's docker-compose file, dataease/docker-compose-mysql.yml; only the container name needs to change;

      version: '2.1'
      services:
        mysql-de:
          image: registry.cn-qingdao.aliyuncs.com/dataease/mysql:5.7.36
          # If you have previously installed MySQL using Docker, you need to modify the container name
          container_name: mysql-de
          env_file:
            - ${DE_BASE}/dataease/conf/mysql.env
          ports:
            - ${DE_MYSQL_PORT}:3306
          volumes:
            - ${DE_BASE}/dataease/conf/my.cnf:/etc/mysql/conf.d/my.cnf
            - ${DE_BASE}/dataease/bin/mysql:/docker-entrypoint-initdb.d/
            - ${DE_BASE}/dataease/data/mysql:/var/lib/mysql
          networks:
            - dataease-network

- If the firewall is enabled, you also need to open port 8010;

      firewall-cmd --zone=public --add-port=8010/tcp --permanent
      firewall-cmd --reload

- When everything is ready, run the install.sh script directly in the installation directory;

      ./install.sh

- The installation involves pulling images, which takes a long time, so please be patient. After a successful installation, the output looks like this;

      ➜  dataease-v1.5.2-online ./install.sh
      Stopping doris-fe ... done
      Stopping doris-be ... done
      Stopping kettle   ... done
      Removing doris-fe ... done
      Removing doris-be ... done
      Removing kettle   ... done
      Removing network dataease_dataease-network
      ======================= Start installation =======================
      [DATAEASE Log]: Copy profile template file -> /opt/dataease/conf
      [DATAEASE Log]: Adjust the configuration file according to the installation configuration parameters
      time: Wed Dec 22 10:59:39 CST 2021
      /usr/sbin/getenforce
      [DATAEASE Log]: Detected Docker Installed, skipping installation steps
      [DATAEASE Log]: start-up Docker
      Redirecting to /bin/systemctl start docker.service
      [DATAEASE Log]: Detected Docker Compose Installed, skipping installation steps
      [DATAEASE Log]: Pull image
      Pulling doris-be ... done
      Pulling kettle   ... done
      Pulling mysql-de ... done
      Pulling dataease ... done
      Pulling doris-fe ... done
      ...Omit several logs
        Name                 Command                        State                    Ports
      -----------------------------------------------------------------------------------------------------
      dataease   /deployments/run-java.sh         Up (health: starting)   0.0.0.0:8010->8081/tcp
      doris-be   /entrypoint.sh                   Up (healthy)
      doris-fe   /entrypoint.sh                   Up (health: starting)
      kettle     /opt/kettle/carte.sh kettl ...   Up
      mysql-de   docker-entrypoint.sh mysqld      Up (healthy)            0.0.0.0:3306->3306/tcp, 33060/tcp
      [DATAEASE Log]: Please wait while the service starts ...
      [DATAEASE Log]: Please wait while the service starts ...
      [DATAEASE Log]: [Warning] the service is not fully started within the waiting time! Please use later dectl status Check service health.
      ======================= installation is complete =======================
      Please visit:
       URL: http://$LOCAL_IP:8010
       user name: admin
       Initial password: dataease

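As the warning in the log shows, the installer may finish before the service is fully started. One way to wait for it is a small polling loop. The sketch below is an assumption-laden illustration: `retry_until` is a hypothetical helper, and the commented usage assumes that curl is installed and that DataEase is listening on port 8010 as configured earlier.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts, and succeed as soon as the command succeeds.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage (assumes curl is installed and the service port is 8010):
#   retry_until 30 10 curl -fsS -o /dev/null http://localhost:8010/ \
#     && echo "DataEase is up" \
#     || echo "still not reachable, check with: dectl status"
```

The helper only reports whether the command eventually succeeded; for the actual health of each container, dectl status from the log above remains the thing to check.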
- Since we have modified the MySQL configuration, we also need to modify the MySQL connection settings under the installation directory /opt. The file is /opt/dataease/conf/dataease.properties, and the host should be changed to mysql-de;

      # Database configuration
      spring.datasource.url=jdbc:mysql://mysql-de:3306/dataease?autoReconnect=false&useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8&zeroDateTimeBehavior=convertToNull&useSSL=false

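If you prefer not to edit the file by hand, the host change can also be scripted with sed. This is just a sketch: `set_jdbc_host` is a hypothetical helper, and it assumes the property line has the form spring.datasource.url=jdbc:mysql://HOST:PORT/... as shown above.

```shell
# Hypothetical helper: replace the host part of spring.datasource.url in a
# properties file, leaving the port, database name and query string untouched.
set_jdbc_host() {
  local new_host="$1" file="$2"
  # -i.bak edits the file in place and keeps the original as FILE.bak.
  sed -i.bak -E \
    "s#^(spring\.datasource\.url=jdbc:mysql://)[^:/]+#\1${new_host}#" \
    "$file"
}

# Example usage, with the path used in this article:
#   set_jdbc_host mysql-de /opt/dataease/conf/dataease.properties
```

Keeping the .bak copy makes it easy to compare the result with the original before restarting the container.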
- Then restart the dataease container;

      docker restart dataease

- While it restarts, use docker logs -f dataease to follow the log. The project has started successfully only once the database import has completed;
- Since DataEase automatically registers the dataease service with the system after a successful installation, we can use the following commands to manage it.

      # View service status
      systemctl status dataease
      # Start service
      systemctl start dataease
      # Stop service
      systemctl stop dataease

Usage
Data visualization is easy with DataEase. Next, let's use data from Excel and MySQL as examples to try out its features.
Basic concepts
Before using DataEase, we need to understand some of its basic concepts, which makes it much easier to use.
- Data source: where the data for subsequent analysis comes from. It holds database connection information and supports common sources such as MySQL, Elasticsearch and MongoDB;
- Dataset: a collection of data, including Excel data, database table data, and custom SQL query data. It is the data that a view is built on;
- View: the smallest unit of visual display and the basic element of a dashboard. It can be a line chart, bar chart, pie chart, and so on;
- Dashboard: a large-screen visualization, the interface where views are combined;
- Template: data and style templates that can be used to build dashboards quickly.
Excel data analysis
Next, we will take data from Excel and build a dashboard to try out the data visualization features of DataEase.
- After DataEase has started successfully, you can log in with the username admin and the password dataease at: http://192.168.3.105:8010/
- Since we changed the name of the MySQL container earlier, we also need to modify the data source here;
- Next, we need to create a dataset, using the official sample Excel file. After downloading it, you can open it and take a look: it is a commodity sales report. The download address is: dataease.io/docs/manual...
- Then select Add dataset;
- Choose to upload Excel when creating it, and finally select OK to import;
- Because we changed Doris's network segment earlier, the imported Excel data will not be displayed, and the following error message pops up;
- Enter the mysql-de container and run the following commands to solve the problem;

      # Enter the built-in MySQL container
      docker exec -it mysql-de sh
      # Inside the MySQL container, connect to the Doris FE
      mysql -uroot -h doris-fe -P 9030
      # Because the Doris network segment was changed, the backend address must be changed here too
      ALTER SYSTEM ADD BACKEND "172.33.0.199:9050";
      SET PASSWORD FOR 'root' = PASSWORD('Password123@doris');
      CREATE DATABASE dataease;

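The backend address in these statements has to match the ipv4_address given to doris-be in the compose file. To keep the two in sync, the statements can be generated from variables. The sketch below is only an illustration: `doris_init_sql` is a hypothetical helper, and the values passed to it are the ones used in this article.

```shell
# Hypothetical helper: print the Doris initialisation statements from this
# article for a given backend IP, root password and database name.
doris_init_sql() {
  local be_ip="$1" password="$2" db="$3"
  cat <<EOF
ALTER SYSTEM ADD BACKEND "${be_ip}:9050";
SET PASSWORD FOR 'root' = PASSWORD('${password}');
CREATE DATABASE ${db};
EOF
}

# Print the statements for review.
doris_init_sql 172.33.0.199 'Password123@doris' dataease

# To apply them, the output could be piped to the mysql client inside the
# mysql-de container (assumption: doris-fe is reachable on port 9030):
#   doris_init_sql 172.33.0.199 'Password123@doris' dataease \
#     | docker exec -i mysql-de mysql -uroot -h doris-fe -P 9030
```

Reviewing the printed statements first is the safer habit, since they change the root password of the Doris instance.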
- After the data has been imported successfully, you can start creating a view and select the dataset we just imported;
- Then select the type of view. Here we choose a pie chart, which is good at showing distribution;
- Drag in the dimensions and indicators, adjust the style, and save to complete the view;
- Create a few more views, and then you can create a dashboard. After some dragging and editing, the dashboard is complete. Very convenient, isn't it!
Database data analysis
Of course, DataEase also supports importing data from a database and even custom SQL queries. Let's try these features.
- First, create a new data source. Many types of data sources are supported; here we choose MySQL;
- Then create a dataset and select Add dataset from database;
- Then create a view and use the dataset created above;
- You can also add datasets with custom SQL queries;
- DataEase also has a powerful linkage feature: views can be linked through a field. For example, in the official example, when we select a province, the other views switch to the data of that province;
- Another interesting feature is drill-down. For example, if we drill down on a province, we can view the data for the cities in that province.
Summary
Overall, DataEase is a very good data visualization tool. It lets us meet data visualization needs without writing code, and it supports analyzing data from a variety of data sources as well as Excel. It also uses the popular big data technologies Apache Doris and Kettle, so if you are interested in them, it is worth a try.