超級商城每日周報: Getting Started | NumPy Basics Every Data Science Beginner Should Know

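This article introduces some NumPy basics that are well suited to data science beginners.

NumPy (Numerical Python) is a linear algebra library for Python. It is a very important library for nearly every data science or machine learning Python package: SciPy (Scientific Python), Matplotlib (plotting library), Scikit-learn, and others all depend on NumPy to some extent.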

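NumPy is very useful for performing mathematical and logical operations on arrays. It provides a large number of useful features for working with n-dimensional arrays and matrices in Python.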

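This tutorial covers the NumPy basics that data science beginners need to know, including how to create NumPy arrays, how to use broadcasting in NumPy, how to access values, and how to manipulate arrays. More importantly, it shows the advantages NumPy has over Python lists: it is more compact, faster at reading and writing items, more convenient, and more efficient.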

This tutorial uses Jupyter notebook as the editor.

Let's get started!

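Installing NumPy

If you already have Anaconda, you can install NumPy from a terminal or command prompt with the following command:

conda install numpy

If you do not have Anaconda, you can install NumPy from a terminal with the following command:

pip install numpy

Once NumPy is installed, you can launch Jupyter notebook and start learning. We begin with NumPy arrays.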

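NumPy arrays

A NumPy array is a grid of values that all have the same type. NumPy arrays come in two forms: vectors and matrices. Strictly speaking, a vector is a one-dimensional array and a matrix is a two-dimensional array. In some cases a matrix has only one row or one column.

First, import NumPy into the Jupyter notebook: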

import numpy as np

Creating a NumPy array from a Python list

Let's first create a Python list:

my_list = [1, 2, 3, 4, 5]

From this list we can easily create a NumPy array named my_numpy_list and display the result:

my_numpy_list = np.array(my_list)
my_numpy_list  # this line shows the resulting array

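We have just converted a Python list into a one-dimensional array. To get a two-dimensional array, we create a list whose elements are themselves lists, as shown below:

second_list = [[1, 2, 3], [5, 4, 1], [3, 6, 7]]
new_2d_arr = np.array(second_list)
new_2d_arr  # this line shows the resulting array

We have successfully created a two-dimensional array with 3 rows and 3 columns.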
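As a small illustration of the "same type" rule for array elements, the sketch below inspects the arrays that np.array() builds from lists. The variable names int_arr and mixed_arr are made up for this example; the exact integer type depends on the platform, so only the general kind is checked.

```python
import numpy as np

second_list = [[1, 2, 3], [5, 4, 1], [3, 6, 7]]
new_2d_arr = np.array(second_list)

# A list of three 3-element lists becomes a 3 x 3 array.
print(new_2d_arr.shape)

# All elements share one type. Integers stay integers...
int_arr = np.array([1, 2, 3])
print(int_arr.dtype)

# ...but mixing in a float converts every element to float.
mixed_arr = np.array([1, 2.5, 3])
print(mixed_arr.dtype)
```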

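Creating NumPy arrays with the built-in arange() function

Similar to Python's built-in range() function, arange() can be used to create a NumPy array.

my_list = np.arange(10)
# OR
my_list = np.arange(0, 10)

This produces the ten integers from 0 to 9; the stop value 10 is not included.

Note that arange() accepts up to three arguments, and the third one is the step size. For example, to get the even numbers from 0 to 10, set the step to 2 (the stop value is 11 so that 10 itself is included):

my_list = np.arange(0, 11, 2)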
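The following sketch shows how the start, stop, and step arguments of arange() interact. The variable names are chosen only for illustration.

```python
import numpy as np

# One argument: counts from 0 up to, but not including, the stop value.
first_ten = np.arange(10)

# Three arguments: start, stop, step. The stop value 11 is excluded,
# so the last even number produced is 10.
evens = np.arange(0, 11, 2)

# A negative step counts downwards; the stop value 0 is still excluded.
countdown = np.arange(10, 0, -3)

print(first_ten)
print(evens)
print(countdown)
```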

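We can also create a one-dimensional array of 7 zeros:

my_zeros = np.zeros(7)

Or a one-dimensional array of 5 ones:

my_ones = np.ones(5)

Likewise, we can generate a two-dimensional array with 3 rows and 5 columns filled with zeros:

two_d = np.zeros((3, 5))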

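Creating NumPy arrays with the built-in linspace() function

linspace() returns evenly spaced numbers over a specified interval. For example, to get 15 evenly spaced points from 1 to 3 (both endpoints included), we only need the following command:

lin_arr = np.linspace(1, 3, 15)

This command produces a one-dimensional vector.

Unlike arange(), the third argument of linspace() is the number of data points to create, not the step size.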
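To make the difference from arange() concrete, here is a minimal sketch using a smaller number of points than the example above so the values are easy to read. The endpoint keyword is a standard linspace() parameter; the variable names are illustrative.

```python
import numpy as np

# Five evenly spaced points from 1 to 3, both endpoints included.
five_points = np.linspace(1, 3, 5)

# With endpoint=False the stop value is left out, as arange() would do.
half_open = np.linspace(1, 3, 4, endpoint=False)

# The same half-open grid expressed with arange() and a step of 0.5.
with_step = np.arange(1, 3, 0.5)

print(five_points)
print(half_open)
print(with_step)
```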

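Creating an identity matrix in NumPy

Identity matrices are very useful in linear algebra. An identity matrix is a two-dimensional square matrix, meaning that the number of columns equals the number of rows. Its diagonal entries are all 1 and every other entry is 0. The function that creates one, np.eye(), usually takes a single argument, as the following command shows:

my_matrx = np.eye(6)  # 6 is the number of rows/columns you want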
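One way to see why the identity matrix matters is to check its defining property: multiplying any compatible matrix by it leaves that matrix unchanged. The sketch below uses a small 3 x 3 example; the names identity, mat, and product are illustrative.

```python
import numpy as np

identity = np.eye(3)                    # 3 x 3, ones on the diagonal
mat = np.arange(1, 10).reshape(3, 3)    # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The @ operator performs matrix multiplication.
product = mat @ identity

print(identity)
print(np.array_equal(product, mat))     # the product equals the original matrix
```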

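Creating an array of random numbers with NumPy

We can use the rand(), randn(), or randint() functions to generate an array of random numbers.

For example, to get a one-dimensional array of 4 values drawn from a uniform distribution over 0 to 1:

my_rand = np.random.rand(4)

If we want a two-dimensional array with 5 rows and 4 columns:

my_rand = np.random.rand(5, 4)
my_rand

randn() draws samples from the standard normal (Gaussian) distribution centered on 0. For example, to generate 7 such numbers:

my_randn = np.random.randn(7)
my_randn

With a large enough sample, a histogram of the values approximates the normal distribution curve.

Similarly, to create a two-dimensional array with 3 rows and 5 columns:

np.random.randn(3, 5)

randint() generates random integers:

np.random.randint(20)  # generates a random integer from 0 up to, but excluding, 20
np.random.randint(2, 20)  # generates a random integer including 2 but excluding 20
np.random.randint(2, 20, 7)  # generates 7 random integers including 2 but excluding 20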
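Because these functions return different numbers on every run, it can help to fix the seed while experimenting. The sketch below is one way to do that with np.random.seed(); the seed value 42 and the sample size of 10000 are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed so that reruns give the same numbers

uniform = np.random.rand(5, 4)        # uniform samples, 0 <= x < 1
normal = np.random.randn(10000)       # standard normal samples
ints = np.random.randint(2, 20, 7)    # 7 integers, 2 <= n < 20

print(uniform.shape)
print(normal.mean(), normal.std())    # close to 0 and 1 for a large sample
print(ints)
```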

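Converting a one-dimensional array into a two-dimensional array

First create a one-dimensional array of 25 random numbers:

arr = np.random.rand(25)

Then use the reshape() function to convert it into a two-dimensional array:

arr.reshape(5, 5)

Note: reshape() only works when the product of the new row and column counts equals the number of elements. The arr above contains 25 elements, so it can be reshaped to 5*5 (or to 1*25 or 25*1), but not, for example, to 4*6. reshape() returns the reshaped array and leaves the shape of arr itself unchanged.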
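The element-count rule can be checked directly. The sketch below uses a 12-element array (an illustrative choice) to show valid shapes, the -1 placeholder that lets NumPy infer one dimension, and the error raised for an incompatible shape.

```python
import numpy as np

values = np.arange(12)            # 12 elements: 0, 1, ..., 11

three_by_four = values.reshape(3, 4)   # 3 * 4 == 12, so this works
two_rows = values.reshape(2, -1)       # -1 asks NumPy to infer the 6

# 5 * 5 == 25 does not match 12 elements, so reshape() raises ValueError.
try:
    values.reshape(5, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print(three_by_four.shape, two_rows.shape, reshape_failed)
print(values.shape)               # the original array keeps its shape
```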

Locating the maximum and minimum values in a NumPy array

With the max() and min() functions we can get the largest or smallest value in an array:

arr_2 = np.random.randint(0, 20, 10)
arr_2.max()  # gives the highest value in the array
arr_2.min()  # gives the lowest value in the array

With the argmax() and argmin() functions we can locate the indices of the maximum and minimum values:

arr_2.argmax()  # shows the index of the highest value in the array
arr_2.argmin()  # shows the index of the lowest value in the array
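Since the random array above changes on every run, a fixed array makes the relationship between values and indices easier to follow. The scores array below is made up for illustration; note that when the maximum appears more than once, argmax() reports the first occurrence.

```python
import numpy as np

scores = np.array([7, 19, 3, 19, 0])

highest = scores.max()          # 19
lowest = scores.min()           # 0
highest_idx = scores.argmax()   # 1, the first position holding 19
lowest_idx = scores.argmin()    # 4

print(highest, highest_idx)
print(lowest, lowest_idx)
```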

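Suppose you are working with many arrays and need to find out the shape of one of them, for instance whether it is one-dimensional or two-dimensional. The shape attribute tells you (it is an attribute rather than a function, so no parentheses are needed):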

arr.shape
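Alongside shape, the ndim and size attributes answer the same kind of question. A minimal sketch, with the names vec and grid chosen for illustration:

```python
import numpy as np

vec = np.arange(6)          # one-dimensional
grid = vec.reshape(2, 3)    # two-dimensional, 2 rows and 3 columns

print(vec.shape, vec.ndim, vec.size)     # (6,) 1 6
print(grid.shape, grid.ndim, grid.size)  # (2, 3) 2 6
```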

Indexing and selecting elements (or groups of elements) from a NumPy array

Indexing a NumPy array works much like indexing in Python; just supply the index you want:

my_array = np.arange(0, 11)
my_array[8]  # gives the value of the element at index 8

To get a range of values from an array, we can use the slice notation ":" just as in Python:

my_array[2:6]  # returns everything from index 2 to 6 (exclusive)
my_array[:6]  # returns everything from index 0 to 6 (exclusive)
my_array[5:]  # returns everything from index 5 to the end of the array

Similarly, we can select elements from a two-dimensional array using either [ ][ ] or [ , ].

Using [ ][ ] to grab the value 60 from the two-dimensional array below:

two_d_arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
two_d_arr[1][2]  # the value 60 is in row index 1, column index 2

Using [ , ] to grab the value 20 from the same array:

two_d_arr[0, 1]

We can also use slice notation to grab subsections of a two-dimensional array. The following operations extract some elements from the array:

two_d_arr[:1, :2]  # returns [[10, 20]]
two_d_arr[:2, 1:]  # returns [[20, 30], [50, 60]]
two_d_arr[:2, :2]  # returns [[10, 20], [40, 50]]

We can also index an entire row or column. To grab any row, simply use its index number:

two_d_arr[0]  # grabs row 0 of the array: [10, 20, 30]
two_d_arr[:2]  # grabs everything before row 2: [[10, 20, 30], [40, 50, 60]]
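The examples above select rows; columns can be selected with the same [ , ] notation by putting a bare ":" in the row position. A minimal sketch (the names middle_col and middle_col_2d are illustrative):

```python
import numpy as np

two_d_arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# ":" in the row position means "every row"; 1 picks column index 1.
middle_col = two_d_arr[:, 1]        # one-dimensional: [20, 50, 80]

# Slicing the column position instead keeps the result two-dimensional.
middle_col_2d = two_d_arr[:, 1:2]   # shape (3, 1)

print(middle_col)
print(middle_col_2d)
```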

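We can also perform conditional and logical selection on arrays, using comparison operators such as >, < and == together with the logical operators & and | to compare the values in an array with a given value:

new_arr = np.arange(5, 15)
new_arr > 10  # returns True where the elements are greater than 10: [False, False, False, False, False, False, True, True, True, True]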

Now we can output the elements that satisfy the condition above:

bool_arr = new_arr > 10
new_arr[bool_arr]  # returns the elements greater than 10: [11, 12, 13, 14]
new_arr[new_arr > 10]  # a shorter way to do what we have just done

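By combining comparison and logical operators, we can get the elements that are greater than 6 and less than 10:

new_arr[(new_arr > 6) & (new_arr < 10)]

The expected result is: ([7, 8, 9])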
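A runnable sketch of combined conditions follows. Each comparison needs its own parentheses because & and | bind more tightly than > and < in Python; the variable names between and edges are illustrative.

```python
import numpy as np

new_arr = np.arange(5, 15)    # 5, 6, ..., 14

# & keeps elements where both conditions hold.
between = new_arr[(new_arr > 6) & (new_arr < 10)]

# | keeps elements where at least one condition holds.
edges = new_arr[(new_arr == 5) | (new_arr > 12)]

print(between)
print(edges)
```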

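Broadcasting

Broadcasting is a quick way to change values in a NumPy array: a single value assigned to a slice is applied to every element of that slice.

my_array[0:3] = 50
# Result is: [50, 50, 50, 3, 4, 5, 6, 7, 8, 9, 10]

In this example, the elements at indices 0 through 2 (the slice 0:3 excludes index 3) are changed from their initial values to 50.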
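One detail worth knowing when assigning through slices is that a slice of a NumPy array is a view of the same data, not a copy. The sketch below illustrates this under illustrative names (first_three, backup); use copy() when the original must stay untouched.

```python
import numpy as np

my_array = np.arange(0, 11)

# A slice is a view: writing through it changes my_array as well.
first_three = my_array[0:3]
first_three[:] = 50

# copy() produces independent data, so this does not affect my_array.
backup = my_array.copy()
backup[:] = 0

print(my_array)
print(backup)
```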

Performing mathematical operations on NumPy arrays

arr = np.arange(1, 11)
arr * arr  # multiplies each element by itself
arr - arr  # subtracts each element from itself
arr + arr  # adds each element to itself
arr / arr  # divides each element by itself

We can also perform scalar operations on arrays; NumPy makes this possible through broadcasting:

arr + 50  # adds 50 to every element in the array
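Broadcasting is not limited to scalars. As a minimal sketch, the example below adds a row vector and a column vector to a small two-dimensional array; the arrays and their names are made up for illustration.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6]])          # shape (2, 3)

row = np.array([10, 20, 30])         # shape (3,), added to every row
col = np.array([[100], [200]])       # shape (2, 1), added to every column

plus_row = mat + row
plus_col = mat + col

print(plus_row)
print(plus_col)
```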

NumPy also lets us apply universal functions to arrays, such as the square root, exponential, and trigonometric functions, along with aggregate functions such as the sum and standard deviation.

np.sqrt(arr)  # returns the square root of each element
np.exp(arr)  # returns the exponential of each element
np.sin(arr)  # returns the sine of each element
np.cos(arr)  # returns the cosine of each element
np.log(arr)  # returns the natural logarithm of each element
np.sum(arr)  # returns the sum of all elements in the array
np.std(arr)  # returns the standard deviation of the elements in the array

We can also get the sums of the rows or columns of a two-dimensional array:

mat = np.arange(1, 26).reshape(5, 5)
mat.sum()  # returns the sum of all the values in mat
mat.sum(axis=0)  # returns the sum of each column in mat
mat.sum(axis=1)  # returns the sum of each row in mat
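Working through the numbers for this 5 x 5 matrix makes the axis argument easier to remember: axis=0 collapses the rows and leaves one total per column, while axis=1 collapses the columns and leaves one total per row. The variable names below are illustrative.

```python
import numpy as np

mat = np.arange(1, 26).reshape(5, 5)   # rows are 1..5, 6..10, ..., 21..25

total = mat.sum()             # 1 + 2 + ... + 25
col_sums = mat.sum(axis=0)    # one total per column
row_sums = mat.sum(axis=1)    # one total per row

print(total)
print(col_sums)
print(row_sums)
```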

That brings this NumPy tutorial to a close. I hope it has been helpful.

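This article represents the views of the author only and does not represent the position of Baidu.
This article is published on Baidu Baijia with the author's authorization and may not be reproduced without permission.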

http://www.kubonews.com/2018042114195.html

Saturday, April 21, 2018
