Data Visualization: “Master of None” in Data Analysis
Why Do We Need Data Visualization?
If you want to be a data analyst, mastering data visualization is essential, because in most cases the boss cares more about how the results are presented than about how they were produced.
In addition, when the results are laid out in front of you as a chart, you can experience the “beauty of data” intuitively. Pictures often convey content far better than words. A good chart not only reflects the data faithfully, but also leaves room for the viewer's imagination.
What are the Views of Data Visualization?
There are more than 10 commonly used kinds of visualization views, including text tables, maps, pie charts, horizontal bars, stacked bars, tree views, scatter plots, histograms, Gantt charts, bubble charts, etc.
Of course, you must not only master the use of these views, but also understand the purpose behind them. Here I divide them into the following 9 cases:
In the above cases, you may want to see the distribution of certain data, its trend over time, and so on. So before designing, you need to decide what you want to present to users, which characteristics of the data need to be highlighted, and which view best suits the presentation.
For example, if you want to show the distribution of a single variable, you can use a histogram. If you want to see the relationship between two variables, you can use a scatter plot.
A view may serve more than one purpose. For example, a scatter plot shows the relationship between two variables and also reflects how each of them is distributed. So if I want to see the distribution of the variables, I can use either a scatter plot or a histogram.
In short, which view you choose depends on what you want the visualization to convey.
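As a rough illustration of matching the view to the purpose, the sketch below draws a histogram and a scatter plot from the same data with Matplotlib. The height and weight values are randomly generated, hypothetical data, and the output file name is an arbitrary choice.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data: 500 heights (cm) and loosely related weights (kg).
rng = random.Random(42)
heights = [rng.gauss(170, 8) for _ in range(500)]
weights = [0.9 * h - 85 + rng.gauss(0, 6) for h in heights]

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Purpose: distribution of one variable -> histogram.
counts, bin_edges, _ = ax_hist.hist(heights, bins=20)
ax_hist.set_title("Distribution of height")
ax_hist.set_xlabel("Height (cm)")
ax_hist.set_ylabel("Count")

# Purpose: relationship between two variables -> scatter plot.
ax_scatter.scatter(heights, weights, s=10, alpha=0.6)
ax_scatter.set_title("Height vs. weight")
ax_scatter.set_xlabel("Height (cm)")
ax_scatter.set_ylabel("Weight (kg)")

fig.tight_layout()
fig.savefig("histogram_vs_scatter.png")
```

The histogram answers "how are the values spread out?", while the scatter plot answers "how do the two variables move together?".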
What are the Data Visualization Tools?
So how do you turn data into the views mentioned above? You need data visualization tools.
There are many such tools, and I will introduce them by category.
Tableau is powerful at visual, flexible analysis, and its main users are professional data analysts. It is also widely used in the workplace, so mastering Tableau is very helpful for promotion and job hunting. However, Tableau is commercial software and the fees are not low. There is also a learning curve, and you need some grounding in data analysis to get started.
FineReport is completely free for individual users. It handles everything from visual reports to dashboards with ease, it offers professional solutions for many industries, and it is easy to operate. FineReport can connect to business data in real time and display it promptly.
Front-end Visualization Components
Front-end visualization components are built on web rendering technologies, so you need to know the typical ones: Canvas, SVG, and WebGL. In simple terms, Canvas and SVG are the main 2D graphics technologies in HTML5, and WebGL is a JavaScript API for hardware-accelerated 3D graphics.
Canvas works with bitmaps: it gives you a blank whiteboard and you draw the pixels yourself. Canvas can draw complex animations. However, it arrived with HTML5, so very old browsers (such as Internet Explorer 8 and earlier) do not support it. ECharts is a visualization component that renders with Canvas by default.
SVG defines graphics in XML. Instead of storing pixels, it describes shapes such as points, lines, and rectangles, that is, vector graphics. Compared to bitmaps, the files are relatively small and the image can be scaled to any size without distortion. SVG is often used for icons and charts. Its biggest advantages are that most browsers support it and that dynamic interactivity is easy to implement, for example by inserting animation elements into the SVG.
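To make the "graphics defined in XML" idea concrete, here is a minimal sketch that builds a small SVG bar chart using only Python's standard library. The function name `bar_chart_svg`, its parameters, and the sample values are hypothetical and chosen for illustration; the sketch assumes all values are positive.

```python
import xml.etree.ElementTree as ET


def bar_chart_svg(values, bar_width=40, gap=10, height=120):
    """Return a minimal SVG bar chart as an XML string (values must be positive)."""
    width = len(values) * (bar_width + gap) + gap
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "viewBox": f"0 0 {width} {height}",
    })
    scale = height / max(values)  # the tallest bar fills the full height
    for i, value in enumerate(values):
        bar_height = value * scale
        # Each bar is one <rect> element; SVG's y axis points downward.
        ET.SubElement(svg, "rect", {
            "x": str(gap + i * (bar_width + gap)),
            "y": str(height - bar_height),
            "width": str(bar_width),
            "height": str(bar_height),
            "fill": "steelblue",
        })
    return ET.tostring(svg, encoding="unicode")


svg_text = bar_chart_svg([30, 80, 45, 60])
print(svg_text)
```

Because the chart is described by shapes rather than pixels, saving the printed text to a `.svg` file and opening it in a browser gives an image that stays sharp at any zoom level.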
WebGL is a JavaScript API that renders 3D graphics in a web browser and lets users interact with them. Many of the cool 3D effects you see on web pages are rendered with WebGL. Three.js, mentioned below, is a library built on top of WebGL.
Now that you understand these web rendering technologies, let's take a look at some common visualization components: ECharts, D3, and Three.js.
If you do data analysis with a programming language, you will most likely learn Python, although some people use R. Whichever you choose, visualization is a part of the work you cannot avoid.
Let me briefly introduce how to use Python and R for data visualization.
Numerous visualization libraries are available for Python, such as Matplotlib, Seaborn, Bokeh, Plotly, Pyecharts, Mapbox, and Geoplotlib. The most frequently used ones are Matplotlib and Seaborn.
Matplotlib is the basic visualization library for Python. Its plotting style is similar to MATLAB's, hence the name Matplotlib. People learning Python data visualization generally start with Matplotlib and then move on to other libraries.
The picture below is the radar chart I made with Matplotlib.
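A radar chart of this kind can be sketched with Matplotlib's polar axes. The labels and scores below are hypothetical sample data, not the data behind the original picture, and the output file name is arbitrary.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical sample data: one score (0-5) per axis.
labels = ["Statistics", "Programming", "Visualization",
          "Communication", "Domain knowledge"]
scores = [4, 3, 5, 4, 2]

# One angle per axis, evenly spaced around the circle.
angles = [2 * math.pi * i / len(labels) for i in range(len(labels))]

# Repeat the first point at the end so the outline closes.
closed_angles = angles + angles[:1]
closed_scores = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(closed_angles, closed_scores, linewidth=2)
ax.fill(closed_angles, closed_scores, alpha=0.25)
ax.set_xticks(angles)
ax.set_xticklabels(labels)
ax.set_ylim(0, 5)
ax.set_title("Skill profile (hypothetical data)")
fig.savefig("radar_chart.png")
```

The key step is repeating the first point at the end of both lists; without it, the polygon is left open between the last axis and the first.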
Seaborn is a higher-level visualization library built on Matplotlib. It wraps Matplotlib in a more convenient interface that makes plotting easier, so you can draw multi-dimensional data visualizations with only a few lines of code.
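As one possible illustration, the sketch below uses Seaborn's `scatterplot` to show four dimensions in a single 2D chart: two on the axes, one as color, and one as marker size. The DataFrame and its column names (`price`, `rating`, `sales`, `category`) are hypothetical, randomly generated data, and the example assumes pandas and Seaborn are installed.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd
import seaborn as sns

# Hypothetical data: three product categories clustered around different centers.
rng = random.Random(0)
centers = {"A": (2, 3), "B": (5, 6), "C": (8, 4)}
rows = []
for category, (cx, cy) in centers.items():
    for _ in range(50):
        rows.append({
            "price": rng.gauss(cx, 1),
            "rating": rng.gauss(cy, 1),
            "sales": rng.uniform(1, 10),
            "category": category,
        })
df = pd.DataFrame(rows)

# x and y carry two dimensions, hue a third (category), size a fourth (sales).
ax = sns.scatterplot(data=df, x="price", y="rating",
                     hue="category", size="sales", sizes=(20, 200))
ax.set_title("Four dimensions in one scatter plot (hypothetical data)")
ax.figure.savefig("seaborn_scatter.png")
```

Drawing the same chart in plain Matplotlib would require mapping categories to colors and building the legend by hand, which is the convenience Seaborn adds.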
There are also many visualization libraries to choose from in R. These include the graphics package that comes with R and the toolkits ggplot2, ggmap, timevis, and plotly.
Among them, ggplot2 is the most important plotting package in R. It separates the data from the plotting operations, so a chart is built up layer by layer. The ggplot approach has also been ported to Python (for example, in the plotnine package), and the code is not much different from ggplot2 code in R. With a little modification, it can run directly in Python.
Today I introduced the main views of data visualization and then walked through the current mainstream data visualization tools. Tableau is a leader in the business intelligence (BI) industry and an essential tool for business data analysis in many large companies. FineReport is completely free for individual users and can produce impressive dashboards for large screens. If you use a programming language for data analysis and data visualization, Python and R are also good choices.