Building Interactive Maps in Python: A Beginner's Guide
Welcome to The Beginner’s Guide to Building Interactive Maps in Python
In this post, I would like to show you how to create interactive climate maps from historical climate data, so that you can visualize, examine, and explore the data. Data visualization plays an important role in representing data, and it helps present your analysis in a form that is easier to understand. It is very easy to get lost when working with large datasets, and that is where data visualization shows its power. In this exercise, we will work with climate data from Kaggle and build two interactive climate maps. The first one will show the climate change of each country, and the second one will show the temperature change over time. Let’s get started, we have a lot to do!
Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals.
Table of Contents:
- Plotly
- Understanding the Data
- Data Cleaning
- Data Preprocessing
- Data Visualization
Plotly
Plotly is a Python graphing library that makes interactive, publication-quality graphs. It supports line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, polar charts, and bubble charts. It is also an open-source library.
To learn more about Plotly: Plotly Graphing Library
Understanding the Data
The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example by country). They publish the source data and the code for the transformations they applied.
The dataset can be found at the following link: Climate Data
The data folder includes the following datasets:
- Global Average Land Temperature by Country
- Global Average Land Temperature by State
- Global Land Temperatures By Major City
- Global Land Temperatures By City
- Global Land and Ocean-and-Land Temperatures
We will be working with the “Global Average Land Temperature by Country” dataset. It fits our goal better because we are going to build interactive climate maps, and having the data broken down by country will make our life much easier.
Libraries
We will need three main libraries to get started. When we come to visualization I will ask you to import a couple more sub-libraries, which are also known as library components. For now, we are going to import the following libraries:
import numpy as np
import pandas as pd
import plotly as py
If you don’t have these libraries, don’t worry. It is super easy to install them, as you can see below:
pip install numpy pandas plotly
Read Data
df = pd.read_csv("data/GlobalLandTemperaturesByCountry.csv")
print(df.head())
print(df.tail())
# Checking the null values in each column
df.isnull().sum()
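If you want to try the null check before downloading the full file, here is a minimal sketch on a tiny stand-in. The three rows are made up for illustration, and the column names are assumed to match the Kaggle file (dt, AverageTemperature, AverageTemperatureUncertainty, Country).

```python
import io
import pandas as pd

# Hypothetical three-row sample; values are invented, column names are
# assumed to match the Kaggle CSV.
sample_csv = io.StringIO(
    "dt,AverageTemperature,AverageTemperatureUncertainty,Country\n"
    "2000-01-01,3.2,0.3,France\n"
    "2000-02-01,,,France\n"
    "2000-01-01,25.1,0.4,Brazil\n"
)
sample = pd.read_csv(sample_csv)

# Count missing values per column, the same call used on the full dataframe.
null_counts = sample.isnull().sum()
print(null_counts)
```

Empty fields in the CSV are read as missing values, so the second row is the only one counted in the two temperature columns.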
Data Cleaning
Data science is largely about understanding the data, and data cleaning is a very important part of that process. How valuable the data is depends on how much we can get out of it. Preparing the data well will make your data analysis results more accurate.
Let’s start the cleaning process by dropping the “AverageTemperatureUncertainty” column, because we don’t need it.
df = df.drop("AverageTemperatureUncertainty", axis=1)
Then, let’s rename the columns so they are easier to read. As you can see below, we are using a method called rename. Isn’t it cool how easy it is to rename a column?
df = df.rename(columns={'dt':'Date'})
df = df.rename(columns={'AverageTemperature':'AvTemp'})
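The same two cleaning steps can also be written as a single chain, since rename accepts several columns in one dictionary. This is only an alternative sketch on a made-up two-row frame, not a change to the steps above.

```python
import pandas as pd

# Hypothetical two-row frame using the original column names.
raw = pd.DataFrame({
    "dt": ["2000-01-01", "2000-02-01"],
    "AverageTemperature": [3.2, 4.1],
    "AverageTemperatureUncertainty": [0.3, 0.2],
    "Country": ["France", "France"],
})

# Drop the unused column, then rename both remaining columns in one call.
cleaned = (
    raw.drop(columns="AverageTemperatureUncertainty")
       .rename(columns={"dt": "Date", "AverageTemperature": "AvTemp"})
)
print(list(cleaned.columns))
```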
Lastly for the data cleaning, let’s drop the rows with null values so that they don’t affect our analysis. As we checked earlier, around 32,000 rows have null values in the AverageTemperature column. In total we have around 577,000 rows, so dropping them is not a big deal. In other cases, there are other ways to handle null values, such as filling or interpolating them.
df = df.dropna()
Now, let’s have a look at our dataframe. I will print the first 10 rows using the head method.
df.head(10)
Data Preprocessing
This step is also known as data manipulation, where we filter the data so that we can focus on a specific analysis. Especially when working with big datasets, data preprocessing and filtering is a must. For example, our historical climate data shows temperatures for all 12 months between 1744 and 2013, which is a very wide range. Using data filtering techniques, we will focus on a smaller range, such as 2000 to 2002.
Comparison Operators
- <
- >
- <=
- >=
- ==
- !=
We will use these operators to compare a specific value to the values in a column. The result is a series of booleans: True where the comparison holds and False where it does not.
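Here is a small sketch of what such a comparison returns, using a made-up four-row frame. It assumes the Date column is kept as text in YYYY-MM-DD form, as in this post. Dates written that way sort the same as text and as calendar dates, which is why comparing them to a string works.

```python
import pandas as pd

# Hypothetical rows; the dates are ISO-formatted strings.
temps = pd.DataFrame({
    "Country": ["France", "France", "France", "France"],
    "Date": ["1999-12-01", "2000-06-01", "2002-01-01", "2002-02-01"],
    "AvTemp": [5.0, 18.2, 4.4, 5.9],
})

# Comparing a column with a value returns one boolean per row.
after_start = temps["Date"] > "2000-01-01"
up_to_end = temps["Date"] <= "2002-01-01"

# Combine the two conditions with & and keep the rows where both hold.
mask = after_start & up_to_end
selected = temps.loc[mask]
print(mask.tolist())  # [False, True, True, False]
```

If the column were converted with pd.to_datetime first, the same operators would still apply. That is the safer route whenever the dates are not in ISO format.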
Grouping by
In this step, we group the dataframe by the Country and Date columns, and sort the values by date from latest to earliest.
df_countries = df.groupby(['Country', 'Date']).sum().reset_index().sort_values('Date', ascending=False)
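To see what the grouping step produces, here is a sketch on a made-up four-row frame. Each country and date pair appears only once in this sample, so the sum simply returns the single value for that pair. I assume the same holds for the full dataset, but it is worth checking.

```python
import pandas as pd

# Hypothetical sample with one row per country per month.
temps = pd.DataFrame({
    "Country": ["France", "Brazil", "France", "Brazil"],
    "Date": ["2000-01-01", "2000-01-01", "2000-02-01", "2000-02-01"],
    "AvTemp": [3.2, 25.1, 4.1, 25.6],
})

# Group by country and date, flatten the index back into columns,
# then sort so the latest dates come first.
grouped = (
    temps.groupby(["Country", "Date"])
         .sum()
         .reset_index()
         .sort_values("Date", ascending=False)
)
print(grouped)
```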
Masking by the date range
start_date = '2000-01-01'
end_date = '2002-01-01'
mask = (df_countries['Date'] > start_date) & (df_countries['Date'] <= end_date)
df_countries = df_countries.loc[mask]
df_countries.head(10)
As you can see above, the dataframe is looking great: sorted by date and grouped by country. We can find the average temperature of each country in each month by looking at this dataframe. Here comes the fun part: data visualization. Are you ready?
Data Visualization
Components of Plotly
Before we start, as mentioned earlier, there are a couple of sub-libraries to import for data visualization. These sub-libraries are also known as components.
# Plotly components
import plotly.express as px
import plotly.graph_objs as go
from plotly.subplots import make_subplots
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
Climate Change Map
Perfect, now by running the following code you will see the magic happening.
# Creating the visualization
fig = go.Figure(data=go.Choropleth(
    locations = df_countries['Country'],
    locationmode = 'country names',
    z = df_countries['AvTemp'],
    colorscale = 'Reds',
    marker_line_color = 'black',
    marker_line_width = 0.5,
))
fig.update_layout(
    title_text = 'Climate Change',
    title_x = 0.5,
    geo=dict(
        showframe = False,
        showcoastlines = False,
        projection_type = 'equirectangular'
    )
)
fig.show()
[Figure: climate change interactive map]
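One thing to keep in mind: the filtered dataframe still holds one row per country per month, while a static map shows a single color per country. It is not obvious from the code which of those monthly values ends up coloring each country. One option is to average the months first, so each country has exactly one value. This is an assumption on my side and not something the code above does. A minimal sketch with made-up numbers:

```python
import pandas as pd

# Hypothetical monthly values for two countries.
monthly = pd.DataFrame({
    "Country": ["France", "France", "Brazil", "Brazil"],
    "Date": ["2000-02-01", "2000-03-01", "2000-02-01", "2000-03-01"],
    "AvTemp": [4.0, 8.0, 25.0, 27.0],
})

# One mean temperature per country over the whole period.
per_country = monthly.groupby("Country", as_index=False)["AvTemp"].mean()
print(per_country)
```

The Country and AvTemp columns of per_country could then be passed as locations and z in the same way as above.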
Climate Change by Timeline
# Manipulating the original dataframe
df_countrydate = df_countries.groupby(['Date', 'Country']).sum().reset_index()
# Creating the visualization
fig = px.choropleth(
    df_countrydate,
    locations="Country",
    locationmode = "country names",
    color="AvTemp",
    hover_name="Country",
    animation_frame="Date"
)
fig.update_layout(
    title_text = 'Average Temperature Change',
    title_x = 0.5,
    geo=dict(
        showframe = False,
        showcoastlines = False,
    )
)
fig.show()
Results
Both images show the same map. In the first one, you can see the change in average temperature. In the second one, I am hovering over some countries, which shows more detailed information about each of them.
[Figure: interactive map 1]
[Figure: interactive map 2]
Thank you for reading this post. I hope you enjoyed it and learned something new today. Feel free to contact me through my blog if you have any questions while implementing the code. I will be more than happy to help. You can find more posts I’ve published related to Python and Machine Learning. Stay safe and happy coding!
I am Behic Guven, and I love sharing stories on creativity, programming, motivation, and life.
Follow my blog and Towards Data Science to stay inspired.
Translated from: https://towardsdatascience.com/building-interactive-maps-in-python-the-beginners-guide-5711dd66257e