Ship ML Model to Data Using PyCaret: Part II
My previous post, "Machine Learning in SQL using PyCaret 1.0", provided details about integrating PyCaret with SQL Server. In this article, I will provide step-by-step details on how to train and deploy a supervised machine learning classification model in SQL Server using PyCaret 2.0 (PyCaret is a low-code ML library in Python).
Things to be covered in this article:
1. How to load data into SQL Server table
2. How to create and save a model in SQL Server table
3. How to make model predictions using the saved model and store results in the table
I. Import/Load Data
You will now have to import the CSV file into a database using SQL Server Management Studio.
Create a table “cancer” in the database
Right-click the database and select Tasks -> Import Data
For Data Source, select Flat File Source. Then use the Browse button to select the CSV file. Spend some time configuring the data import before clicking the Next button.
For Destination, select the correct database provider (e.g. SQL Server Native Client 11.0). Enter the Server name; check Use SQL Server Authentication, enter the Username, Password, and Database before clicking the Next button.
In the Select Source Tables and Views window, you can Edit Mappings before clicking the Next button.
Check Run immediately and click the Next button
Click the Finish button to run the package
II. Create ML Model & Save in Database Table
Classification is a type of supervised machine learning that predicts categorical class labels, which are discrete and unordered. The classification module in the PyCaret package can be used for binary or multiclass problems.
In this example, we will be using a ‘Breast Cancer Dataset’. Creating and saving a model in a database table is a multi-step process. Let’s go through the steps one by one:
i. Create a stored procedure that trains a model, in this case an Extra Trees Classifier. The procedure will read data from the cancer table created in the previous step.
Below is the code used to create the procedure:
-- Stored procedure that generates a PyCaret model using the cancer data using Extra Trees Classifier Algorithm
DROP PROCEDURE IF EXISTS generate_cancer_pycaret_model;
GO
CREATE PROCEDURE generate_cancer_pycaret_model (@trained_model varbinary(max) OUTPUT) AS
BEGIN
EXECUTE sp_execute_external_script
@language = N'Python'
, @script = N'
import pycaret.classification as cp
import pickle
trail1 = cp.setup(data = cancer_data, target = "Class", silent = True, n_jobs=None)
# Create Model
et = cp.create_model("et", verbose=False)
#To improve our model further, we can tune hyper-parameters using tune_model function.
#We can also optimize tuning based on an evaluation metric. As our choice of metric is F1-score, lets optimize our algorithm!
tuned_et = cp.tune_model(et, optimize = "F1", verbose=False)
#The finalize_model() function fits the model onto the complete dataset.
#The purpose of this function is to train the model on the complete dataset before it is deployed in production
final_model = cp.finalize_model(tuned_et)
# Before saving the model to the DB table, convert it to a binary object
trained_model = []
prep = cp.get_config("prep_pipe")
trained_model.append(prep)
trained_model.append(final_model)
trained_model = pickle.dumps(trained_model)
'
, @input_data_1 = N'select "Class", "age", "menopause", "tumor_size", "inv_nodes", "node_caps", "deg_malig", "breast", "breast_quad", "irradiat" from dbo.cancer'
, @input_data_1_name = N'cancer_data'
, @params = N'@trained_model varbinary(max) OUTPUT'
, @trained_model = @trained_model OUTPUT;
END;
GO
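The last lines of the script pack the preprocessing pipeline and the finalized model into a list and serialize it with pickle, so that it can travel through the varbinary(max) output parameter. The following is a minimal sketch of that round trip using only the Python standard library. The two dictionaries are hypothetical stand-ins for the PyCaret objects, not real PyCaret output.

```python
import pickle

# Hypothetical stand-ins for cp.get_config("prep_pipe") and the finalized model.
prep = {"kind": "prep_pipe"}
final_model = {"kind": "extra_trees"}

# Same packing order as the stored procedure: pipeline first, model second.
trained_model = pickle.dumps([prep, final_model])

# pickle.dumps returns a bytes object, which is what a varbinary(max) value carries.
restored_prep, restored_model = pickle.loads(trained_model)
print(type(trained_model).__name__, restored_prep["kind"], restored_model["kind"])
```

Keeping the pipeline and the model in one pickled list means a single binary value is enough to reproduce both the preprocessing and the prediction later.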
ii. Create a table that is required to store the trained model object
DROP TABLE IF EXISTS dbo.pycaret_models;
GO
CREATE TABLE dbo.pycaret_models (
model_id INT NOT NULL PRIMARY KEY,
dataset_name VARCHAR(100) NOT NULL DEFAULT('default dataset'),
model_name VARCHAR(100) NOT NULL DEFAULT('default model'),
model VARBINARY(MAX) NOT NULL
);
GO
iii. Invoke stored procedure to create a model object and save into a database table
DECLARE @model VARBINARY(MAX);
EXECUTE generate_cancer_pycaret_model @model OUTPUT;
INSERT INTO pycaret_models (model_id, dataset_name, model_name, model) VALUES(2, 'cancer', 'Extra Trees Classifier algorithm', @model);
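To see the store-and-retrieve pattern in isolation, here is a rough sketch that saves a pickled object into a table with the same columns and reads it back. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for SQL Server (so the column types are SQLite's, not T-SQL's), and the pickled list holds placeholder strings rather than a real pipeline and model.

```python
import pickle
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pycaret_models ("
    "model_id INTEGER NOT NULL PRIMARY KEY, "
    "dataset_name TEXT NOT NULL DEFAULT 'default dataset', "
    "model_name TEXT NOT NULL DEFAULT 'default model', "
    "model BLOB NOT NULL)"
)

# Hypothetical placeholder for the pickled [prep, final_model] list.
model_bytes = pickle.dumps(["prep placeholder", "model placeholder"])

conn.execute(
    "INSERT INTO pycaret_models (model_id, dataset_name, model_name, model) "
    "VALUES (?, ?, ?, ?)",
    (2, "cancer", "Extra Trees Classifier algorithm", model_bytes),
)

# Look the model up by id, dataset name, and model name.
row = conn.execute(
    "SELECT model FROM pycaret_models "
    "WHERE model_id = ? AND dataset_name = ? AND model_name = ?",
    (2, "cancer", "Extra Trees Classifier algorithm"),
).fetchone()
loaded = pickle.loads(row[0])
print(loaded)
```

Identifying a model by id, dataset name, and model name together makes it possible to keep several models for different datasets in the same table.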
The output of this execution is:
Output from Console
The view of table results after saving the model:
SQL Server Table Results
III. Running Predictions
The next step is to run the prediction for the test dataset based on the saved model. This is again a multi-step process. Let’s go through all the steps again.
i. Create a stored procedure that will use the test dataset to detect cancer for a test datapoint
Below is the code to create a database procedure:
DROP PROCEDURE IF EXISTS pycaret_predict_cancer;
GO
CREATE PROCEDURE pycaret_predict_cancer (@id INT, @dataset varchar(100), @model varchar(100))
AS
BEGIN
DECLARE @py_model varbinary(max) = (select model
from pycaret_models
where model_name = @model
and dataset_name = @dataset
and model_id = @id
);
EXECUTE sp_execute_external_script
@language = N'Python',
@script = N'
# Import the PyCaret classification module and pickle.
import pycaret.classification as cp
import pickle
cancer_model = pickle.loads(py_model)
# Generate the predictions for the test set.
predictions = cp.predict_model(cancer_model, data=cancer_score_data)
OutputDataSet = predictions
print(OutputDataSet)
'
, @input_data_1 = N'select "Class", "age", "menopause", "tumor_size", "inv_nodes", "node_caps", "deg_malig", "breast", "breast_quad", "irradiat" from dbo.cancer'
, @input_data_1_name = N'cancer_score_data'
, @params = N'@py_model varbinary(max)'
, @py_model = @py_model
with result sets (("Class" INT, "age" INT, "menopause" INT, "tumor_size" INT, "inv_nodes" INT,
"node_caps" INT, "deg_malig" INT, "breast" INT, "breast_quad" INT,
"irradiat" INT, "Class_Predict" INT, "Class_Score" float ));
END;
GO
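The result set declared by this procedure has twelve columns: the ten input columns followed by a predicted label and a score. The sketch below illustrates that output shape with plain Python. It is not PyCaret code: the pickled "model" is a hypothetical dictionary holding a cutoff, the scoring rule is made up for illustration, and score_rows is a hypothetical helper name.

```python
import pickle

FEATURES = ["age", "menopause", "tumor_size", "inv_nodes", "node_caps",
            "deg_malig", "breast", "breast_quad", "irradiat"]

def score_rows(model_bytes, rows):
    """Append a predicted label and a score to each input row."""
    # Hypothetical stand-in: the "model" is just a dict holding a cutoff.
    model = pickle.loads(model_bytes)
    scored = []
    for row in rows:
        label = 1 if row["deg_malig"] >= model["cutoff"] else 0
        score = 1.0  # placeholder; a real classifier would return a probability
        scored.append({**row, "Class_Predict": label, "Class_Score": score})
    return scored

model_bytes = pickle.dumps({"cutoff": 3})
rows = [{"Class": 0, **{name: 1 for name in FEATURES}}]
rows[0]["deg_malig"] = 3
result = score_rows(model_bytes, rows)
print(list(result[0].keys()))
```

Because the scored rows keep the input columns and add two more, the table that receives the predictions needs one column for each of them, in the same order.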
ii. Create a table to save the predictions along with the dataset
DROP TABLE IF EXISTS [dbo].[pycaret_cancer_predictions];
GO
CREATE TABLE [dbo].[pycaret_cancer_predictions](
[Class_Actual] [nvarchar] (50) NULL,
[age] [nvarchar] (50) NULL,
[menopause] [nvarchar] (50) NULL,
[tumor_size] [nvarchar] (50) NULL,
[inv_nodes] [nvarchar] (50) NULL,
[node_caps] [nvarchar] (50) NULL,
[deg_malig] [nvarchar] (50) NULL,
[breast] [nvarchar] (50) NULL,
[breast_quad] [nvarchar] (50) NULL,
[irradiat] [nvarchar] (50) NULL,
[Class_Predicted] [nvarchar] (50) NULL,
[Class_Score] [float] NULL
) ON [PRIMARY]
GO
iii. Call the pycaret_predict_cancer procedure to save the prediction results into a table
--Insert the results of the predictions for test set into a table
INSERT INTO [pycaret_cancer_predictions]
EXEC pycaret_predict_cancer 2, 'cancer', 'Extra Trees Classifier algorithm';
iv. Execute the SQL below to view the result of the prediction
-- Select contents of the table
SELECT * FROM [pycaret_cancer_predictions];
Predictions Result
IV. Conclusion
In this post, we learnt how to build a classification model using PyCaret in SQL Server. Similarly, you can build and run other types of supervised and unsupervised ML models depending on the needs of your business problem.
Photo by Tobias Fischer on Unsplash
You can further check out the PyCaret website for documentation on other supervised and unsupervised experiments that can be implemented in a similar manner within SQL Server.
My future posts will be tutorials on exploring other supervised & unsupervised learning techniques using Python and PyCaret within SQL Server.
V. Important Links
PyCaret
My LinkedIn Profile
Translated from: https://towardsdatascience.com/ship-ml-model-to-data-using-pycaret-part-ii-6a8b3f3d04d0