Building a Scorecard from 0 to 1: Model Building

Published: 2025/3/21

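Scorecard modeling commonly uses logistic regression: the probability output by the model is mapped to a score, which yields a standard scorecard. For the logic behind the score mapping, see the earlier article 邏輯回歸評分卡映射邏輯 (Logistic Regression Scorecard Mapping Logic).
This article picks up from the binned data and builds the model. The first step is to convert each variable's binning result into a WOE result table.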

import numpy as np
import pandas as pd

# WOE result table for all variables
def woe_df_concat(bin_df):
    """
    bin_df: list holding the binning result (a DataFrame) of each variable
    return: WOE result table
    """
    woe_df_list = []
    for df in bin_df:
        woe_df = df.reset_index().assign(col=df.index.name).rename(columns={df.index.name: 'bin'})
        woe_df_list.append(woe_df)
    woe_result = pd.concat(woe_df_list, axis=0)
    # For easier reading, move the variable-name column to the first position
    woe_result1 = woe_result['col']
    woe_result2 = woe_result.iloc[:, :-1]
    woe_result_df = pd.concat([woe_result1, woe_result2], axis=1)
    woe_result_df = woe_result_df.reset_index(drop=True)
    return woe_result_df

# bin_df_cat and bin_df_num are the binning results from the previous step
df_woe_cat = woe_df_concat(bin_df_cat)
df_woe_num = woe_df_concat(bin_df_num)
df_woe = pd.concat([df_woe_cat, df_woe_num], axis=0)

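The purpose of this step is to consolidate the binning results so that you can see roughly how each variable was binned.

The consolidated table shows each variable's bins, the share of good and bad samples in each bin, and the WOE and IV values. Note that it is best if no bin's WOE exceeds 1. The code below checks for bins whose WOE reaches 1.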

# Check whether any bin's WOE reaches 1
def woe_large(bin_df):
    """
    bin_df: list holding the binning result of each variable
    return:
        woe_large_col: list of variables that have a bin with WOE >= 1
        woe_judge_df: DataFrame with the check result for each variable
    """
    woe_large_col = []
    col_list = []
    woe_judge = []
    for woe_df in bin_df:
        col_name = woe_df.index.name
        woe_list = list(woe_df.woe)
        # Only the positive side is tested; strongly negative WOE is not flagged
        large_bins = [w for w in woe_list if w >= 1]
        col_list.append(col_name)
        if len(large_bins) > 0:
            woe_judge.append('True')
            woe_large_col.append(col_name)
        else:
            woe_judge.append('False')
    woe_judge_df = pd.DataFrame({'col': col_list, 'judge_large': woe_judge})
    return woe_large_col, woe_judge_df
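
To make the WOE and IV columns of the result table concrete, here is a minimal sketch in plain Python of how they can be computed from per-bin good and bad counts. The binning code itself is not shown in this article, so the function name `woe_iv`, the toy counts, and the sign convention (WOE as the log of the bad share over the good share) are assumptions for illustration; the opposite sign is also widely used and gives the same IV.

```python
import math

def woe_iv(good_counts, bad_counts):
    """Return (woe_per_bin, iv) for per-bin good/bad counts.

    Assumed convention: WOE_i = ln(bad_share_i / good_share_i).
    Every bin must contain at least one good and one bad sample.
    """
    good_total = sum(good_counts)
    bad_total = sum(bad_counts)
    woe = []
    iv = 0.0
    for good, bad in zip(good_counts, bad_counts):
        good_share = good / good_total
        bad_share = bad / bad_total
        w = math.log(bad_share / good_share)
        woe.append(w)
        # Each bin adds (bad_share - good_share) * WOE, which is never negative
        iv += (bad_share - good_share) * w
    return woe, iv

# Hypothetical counts for a variable with three bins
woe, iv = woe_iv(good_counts=[400, 300, 100], bad_counts=[20, 30, 50])
# Bins whose WOE is at least 1 in absolute value
large_bins = [i for i, w in enumerate(woe) if abs(w) >= 1]
print([round(w, 3) for w in woe], round(iv, 3), large_bins)
```

The check in this sketch uses the absolute value, so it would also flag a strongly negative bin, which the `>= 1` test in `woe_large` does not.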

Next, each variable's values are mapped to the corresponding WOE values in preparation for modeling.

# WOE transformation
def woe_transform(df, target, df_woe):
    """
    df: dataset
    target: column name of the target variable
    df_woe: WOE result table
    return: dataset after WOE transformation
    """
    df2 = df.copy()
    for col in df2.drop([target], axis=1).columns:
        x = df2[col]
        bin_map = df_woe[df_woe.col == col]
        # Values that fall into no bin keep a WOE of 0
        bin_res = np.zeros(x.shape[0], dtype=float)
        for i in bin_map.index:
            lower = bin_map['min_bin'][i]
            upper = bin_map['max_bin'][i]
            # Build the row mask directly; bounds are inclusive on both sides
            if lower == upper:
                mask = (x == lower).to_numpy()
            else:
                mask = ((x >= lower) & (x <= upper)).to_numpy()
            bin_res[mask] = bin_map['woe'][i]
        df2[col] = pd.Series(bin_res, index=x.index, name=x.name)
    return df2
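
The lookup that `woe_transform` performs for each column can be illustrated on toy data. The sketch below uses plain Python rather than pandas; the bin table, its WOE values, and the helper name `map_to_woe` are made up for illustration. It follows the same inclusive `min_bin`/`max_bin` bounds and the same default of 0 for values that fall into no bin.

```python
def map_to_woe(values, bins):
    """Map raw values to WOE using (min_bin, max_bin, woe) rows.

    Bounds are inclusive on both sides; unmatched values get 0.0.
    """
    result = []
    for v in values:
        mapped = 0.0
        for lower, upper, woe in bins:
            if lower <= v <= upper:
                mapped = woe  # a later matching bin overwrites an earlier one
        result.append(mapped)
    return result

# Hypothetical bin table for an "age" variable
age_bins = [
    (18, 25, 0.62),
    (26, 40, 0.10),
    (41, 60, -0.35),
]
ages = [22, 26, 40, 55, 70]
print(map_to_woe(ages, age_bins))
```

The last value, 70, lies outside every bin and therefore maps to 0, which is worth checking for in real data because it silently treats unseen values as neutral.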

After the transformation, every variable's values have been replaced by the corresponding WOE values. The next step is to build the model.

from sklearn.linear_model import LogisticRegression

feature_list = num_features + cat_features
x = df_train[feature_list]
y = df_train['y']

lr_model = LogisticRegression(C=0.1)
lr_model.fit(x, y)
# Predicted probability of class 1
df_train['prob'] = lr_model.predict_proba(x)[:, 1]

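The modeling code is simple: those few lines complete the logistic regression fit. Next comes the score mapping. The code below requires an understanding of the scorecard score-mapping logic.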

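# Scorecard scale
def cal_scale(score, odds, PDO, model):
    """
    odds: the chosen bad-to-good odds
    score: the score assigned at these odds
    PDO: points to double the odds
    model: logistic regression model (only used by the commented-out base score)
    return: A, B
    """
    B = PDO / np.log(2)
    A = score + B * np.log(odds)
    # Sign follows score = A - B*log(odds), as used in Prob2Score
    # base_score = A - B * model.intercept_[0]
    print('B: {:.2f}'.format(B))
    print('A: {:.2f}'.format(A))
    # print('Base score: {:.2f}'.format(base_score))
    return A, B

cal_scale(50, 0.05, 10, lr_model)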

Assume a score of 50 at bad-to-good odds of 5% and a PDO of 10, then compute the two parameters A and B. This gives A = 6.78 and B = 14.43.
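
As a quick check of the scale, the sketch below recomputes the two parameters with the standard library only and confirms two properties the scale should have: the score is 50 at odds of 0.05, and doubling the odds lowers the score by exactly one PDO. The helper names `scale_params` and `score_at` are illustrative and not part of the original code.

```python
import math

def scale_params(score, odds, pdo):
    """Return (A, B) such that score = A - B * ln(odds)."""
    B = pdo / math.log(2)
    A = score + B * math.log(odds)
    return A, B

def score_at(odds, A, B):
    """Score for the given bad-to-good odds."""
    return A - B * math.log(odds)

A, B = scale_params(score=50, odds=0.05, pdo=10)
print('A: {:.2f}, B: {:.2f}'.format(A, B))   # A: 6.78, B: 14.43
print(round(score_at(0.05, A, B), 6))        # 50.0
print(round(score_at(0.10, A, B), 6))        # 40.0
```

Because the score falls as the odds of being bad rise, a higher score means lower risk under this scale.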

def Prob2Score(prob, A, B):
    # Convert a probability into a score (returned as a float, not rounded)
    y = np.log(prob / (1 - prob))
    return float(A - B * y)

df_train['score'] = df_train['prob'].map(lambda x: Prob2Score(x, 6.78, 14.43))
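
Since the log-odds of a logistic regression equal its linear predictor (the intercept plus the sum of coefficient times WOE), the score `A - B*log(prob/(1-prob))` can be split into a base score and per-variable points, which is the form a scorecard table usually takes. The sketch below illustrates this with made-up coefficients and WOE values; none of these numbers or variable names come from the model fitted above.

```python
import math

A, B = 6.78, 14.43  # the scale parameters passed to Prob2Score

# Hypothetical fitted model: intercept and one coefficient per variable
intercept = -2.5
coefs = {'age': 0.8, 'income': 1.1}
# WOE values of one applicant's bins (also hypothetical)
applicant_woe = {'age': 0.62, 'income': -0.35}

# Linear predictor = log(prob / (1 - prob))
log_odds = intercept + sum(coefs[k] * applicant_woe[k] for k in coefs)
prob = 1 / (1 + math.exp(-log_odds))

# Score computed directly from the probability, as Prob2Score does
score_from_prob = A - B * math.log(prob / (1 - prob))

# Same score assembled from a base score plus points per variable
base_score = A - B * intercept
points = {k: -B * coefs[k] * applicant_woe[k] for k in coefs}
score_from_points = base_score + sum(points.values())

print(round(score_from_prob, 2), round(score_from_points, 2))
```

Both routes give the same number, so the points for each bin can be tabulated once and simply added up when scoring a new applicant.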

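The score is computed as A - B*log(odds), where odds = prob/(1 - prob). With that, the scorecard is built. The model still has to be evaluated and the scorecard's scores monitored afterwards; both were covered in earlier articles:

風控模型評估 (Risk Control Model Evaluation)

評分卡實現和評估 (Scorecard Implementation and Evaluation)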

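For this scorecard series I found code and datasets online and worked through them step by step, as a record of my own learning. Next I plan to find some machine learning model code to practice with. As a beginner, I have surely misunderstood some points in this article, and corrections are very welcome. Let's track our growth, share, and improve together.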

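[Author]: Labryant
[Original WeChat official account]: 風控獵人
[About]: Strategy analyst at a startup, motivated and working hard to improve. Nothing is settled yet, and any of us could be the dark horse.
[Reposting]: Please credit the source when reposting. Thank you!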
