Today's Code (200623) -- Return-Visit Date Prediction (python + R)
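Code notes, for reference only.
Return-visit date prediction
Preface: the goal is to predict when each customer will next return to the service shop. Most customers have fewer than 10 return visits; only a small number have more than 30.
Mean-interval prediction (python)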
# -*- coding: utf-8 -*-
import pymysql
import time
import numpy as np
import pandas as pd


class CarTest:
    def __init__(self):
        self.db = pymysql.connect(host='127.0.0.1',
                                  port=3306,
                                  user='root',
                                  password='19970928',
                                  database='datacup',
                                  charset='utf8')
        self.cur = self.db.cursor()
        self.avertime = None
        self.dataframe = None
        self.preTime = {}
        self.Accuracy = 0

    def getDate(self):
        # Load the CSV into MySQL row by row; the first readline skips the header.
        with open("./data/car_srv_train.csv", 'r') as f:
            data = f.readline()
            data = f.readline()
            print(type(data))
            while (data):
                # data = self.timeDeal(data)
                self.mysqlInteractive(data)
                data = f.readline()

    def timePre(self):
        # Group the visit dates by customer.
        dictTest = {}
        for key, value in self.dataframe:
            if key not in dictTest:
                dictTest[key] = [value]
            else:
                dictTest[key].append(value)
        # print(dictTest)
        for key in dictTest:
            if len(dictTest[key]) > 5:
                # Newest first: sortTemp[0] is held out as the true last visit.
                sortTemp = dictTest[key]
                sortTemp.sort(reverse=True)
                # print(sortTemp)
                # Note: sortTemp[1]..sortTemp[-1] spans len-2 gaps, but the
                # divisor here is len-1, so the mean is slightly underestimated.
                averTimeInterval = (sortTemp[1] - sortTemp[-1]) / (len(dictTest[key]) - 1)
                divTime = sortTemp[1] - sortTemp[2]
                preTime = (averTimeInterval + divTime) / 2
                # 1: predicted interval, 2: actual interval,
                # 3: second-to-last date, 4: last date
                self.preTime[key] = [preTime]
                self.preTime[key].append(sortTemp[0] - sortTemp[1])
                self.preTime[key].append(sortTemp[1])
                self.preTime[key].append(sortTemp[0])

    def errorTest(self):
        # A prediction counts as correct if it is within 3 working days.
        trueNum = 0
        countNum = 0
        preDictTemp = self.preTime
        # value is a reference type, so iterating with key, value may be unnecessary
        for key in preDictTemp:
            countNum += 1
            # print(preDictTemp[key][1] - preDictTemp[key][0])
            if np.abs((preDictTemp[key][1] - preDictTemp[key][0]).days) < 3:
                trueNum += 1
        self.Accuracy = trueNum / countNum

    def mysqlInteractive(self, data):
        dataList = data.split(",")[0:3]
        # print(dataList)
        sqlLine = ("insert into CarTest(ordername, username, ordertime) "
                   "values(%s, %s, %s);")
        try:
            self.cur.execute(sqlLine, dataList)
            self.db.commit()
            print("Commit succeeded...")
        except Exception as e:
            self.db.rollback()
            print('Error:', e)

    def getMysqData(self):
        sqlLine = ("select username, ordertime from CarTest "
                   "order by username,ordertime;")
        # sqlLine2 = """select username, DATEDIFF(max(ordertime), min(ordertime))/count(*) as avertime, count(*) as countnum from
        # cartest group by username HAVING COUNT(*)>=3;"""
        self.cur.execute(sqlLine)
        self.dataframe = self.cur.fetchall()
        # self.db.commit()
        # self.cur.execute(sqlLine2)
        # self.avertime = self.cur.fetchall()
        # self.db.commit()

    def main(self):
        # self.getDate()
        self.getMysqData()
        self.timePre()
        self.errorTest()
        self.cur.close()
        self.db.close()


if __name__ == '__main__':
    # Timestamp at start
    start = time.time()
    dataOutLine = CarTest()
    dataOutLine.main()
    print("Accuracy:", dataOutLine.Accuracy)
    # Timestamp at end
    end = time.time()
    print('Execution time: %.2f' % (end - start))
Output:
Accuracy: 0.016651248843663275
Execution time: 23.67
Well, the accuracy is remarkably low.
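To make the prediction rule easier to inspect without a MySQL instance, here is a minimal standalone sketch of the same idea: hold out each customer's last visit, then predict the final gap as the average of the mean historical gap and the most recent gap. The function names, the `min_visits` and `tolerance_days` parameters, and the in-memory input format are assumptions made for illustration and are not part of the original class. Two deliberate differences: the mean gap divides by the actual number of gaps in the history, and the tolerance check compares timedeltas directly rather than their `.days` attribute.

```python
from datetime import date, timedelta


def predict_next_interval(visits):
    """Return (predicted_gap, actual_gap) for one customer.

    `visits` is a list of datetime.date values. The latest visit is held
    out as ground truth; the rest form the history.
    """
    ordered = sorted(visits)
    if len(ordered) < 3:
        raise ValueError("need at least 3 visits to form a history gap")
    history, last = ordered[:-1], ordered[-1]
    # Mean gap: total span of the history divided by its number of gaps.
    mean_gap = (history[-1] - history[0]) / (len(history) - 1)
    # Most recent gap inside the history.
    recent_gap = history[-1] - history[-2]
    predicted = (mean_gap + recent_gap) / 2
    actual = last - history[-1]
    return predicted, actual


def accuracy(visits_by_customer, tolerance_days=3, min_visits=6):
    """Share of customers whose predicted gap is within the tolerance."""
    hits = total = 0
    for visits in visits_by_customer.values():
        if len(visits) < min_visits:
            continue  # mirrors the "more than 5 visits" filter
        predicted, actual = predict_next_interval(visits)
        total += 1
        if abs(actual - predicted) < timedelta(days=tolerance_days):
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    start = date(2020, 1, 1)
    # Hypothetical customer who returns every 10 days.
    regular = [start + timedelta(days=10 * i) for i in range(6)]
    print(predict_next_interval(regular))
    print(accuracy({"demo": regular}))
```

Because the rule only looks at the average and the latest gap, it can only do well for customers whose visits are nearly periodic, which may partly explain the low score above.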
Time series forecasting (R)
library(forecast)

# Read the data
getwd()
setwd("C:/Users/goatbishop/Desktop/data")
car_srv_train <- read.csv("car_srv_train.csv", header = T, stringsAsFactors = F)

# Quick look at the data
head(car_srv_train)
dim(car_srv_train)
str(car_srv_train)
car_srv_train$ORDERDATE <- as.Date(car_srv_train$ORDERDATE)

test0624car <- car_srv_train
head(diff(test0624car$ORDERDATE))

# Sort by customer, then by visit date
test0624car <- test0624car[order(test0624car$CUST_ID, test0624car$ORDERDATE, decreasing = F), ]
#head(test0624car[,c(1:4)])

# Keep customers with at least 3 visits
table(test0624car$CUST_ID)
innames <- names(which(table(test0624car$CUST_ID) >= 3))

test0624car3 <- test0624car[which(test0624car$CUST_ID %in% innames), ]
tablenum <- table(test0624car3$CUST_ID)
length(tablenum)
# Drop each customer's last row so the rows line up with the visit gaps
newtest0624car <- test0624car3[-cumsum(tablenum), ]
dim(test0624car3)
dim(newtest0624car)

# Gaps (in days) between consecutive visits, per customer
test0624diff <- tapply(test0624car3$ORDERDATE, test0624car3$CUST_ID, diff)
dfg <- unlist(test0624diff)
dftest <- cbind(newtest0624car, dfg)

head(dftest[, c(1:3, 13)], 10)
write.csv(dftest, "dftest.csv")

##### Time series #####
# Only customers with at least 30 visits
innames2 <- names(which(table(test0624car$CUST_ID) >= 30))

Pre <- c()
real <- c()
for (item in c(1:length(innames2))) {
  tempdf <- dftest$dfg[which(dftest$CUST_ID == innames2[item])]
  # Fit on all gaps except the last, then forecast one step ahead
  temparima <- auto.arima(tempdf[-length(tempdf)])
  preout <- forecast(temparima, 1)
  real <- c(real, tempdf[length(tempdf)])
  Pre <- c(Pre, preout$mean)
}

errorPre <- real - Pre
write.csv(Pre, "Pre.csv")

### Accuracy ###
# A forecast counts as correct if it is off by less than 10 days
a1 = 0
b1 = 0
for (item in c(1:length(errorPre))) {
  b1 = b1 + 1
  if (abs(errorPre[item]) < 10) {
    a1 = a1 + 1
  }
}
(accer <- a1/b1)

write.csv(errorPre, "errorPre.csv")
write.csv(accer, "accer.csv")

which(abs(errorPre) < 2)

# Inspect a single customer
tempdf <- dftest$dfg[which(dftest$CUST_ID == innames2[4])]
plot(tempdf, type = 'o', main = "Time series plot")

temparima <- auto.arima(tempdf[-length(tempdf)])
pretest <- forecast(temparima)

acf(pretest$residuals)
write.csv(pretest$residuals, "pretest_residuals.csv")
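For readers who want to follow the data preparation and scoring steps of the R script without R, the sketch below reproduces them in plain Python: build each customer's series of day gaps, hold out the last gap, forecast it from the earlier ones, and count a hit when the absolute error is under a tolerance. This is an illustration under stated assumptions, not a port of the ARIMA model. The function names and the `(customer_id, visit_date)` input format are hypothetical, and the forecaster is a pluggable callable; `statistics.mean` stands in for `auto.arima` purely as a placeholder.

```python
from datetime import date
from itertools import groupby
from statistics import mean


def interval_series(records, min_visits=3):
    """Map each customer to the day gaps between consecutive visits.

    `records` is an iterable of (customer_id, visit_date) pairs; customers
    with fewer than `min_visits` visits are dropped.
    """
    ordered = sorted(records)  # by customer, then by date
    series = {}
    for cust, group in groupby(ordered, key=lambda r: r[0]):
        dates = [d for _, d in group]
        if len(dates) >= min_visits:
            series[cust] = [(b - a).days for a, b in zip(dates, dates[1:])]
    return series


def holdout_accuracy(series, forecast, tolerance=10, min_visits=30):
    """Forecast each customer's last gap from the earlier gaps and score it.

    `forecast` is any callable taking a list of gaps and returning a number
    (a placeholder for a real time series model).
    Returns (accuracy, errors_by_customer).
    """
    errors = {}
    for cust, gaps in series.items():
        if len(gaps) + 1 < min_visits:  # n visits produce n - 1 gaps
            continue
        errors[cust] = gaps[-1] - forecast(gaps[:-1])
    hits = sum(1 for e in errors.values() if abs(e) < tolerance)
    return (hits / len(errors) if errors else 0.0), errors


if __name__ == "__main__":
    # Hypothetical toy data: one customer with four visits.
    demo = [("a", date(2020, 1, 1)), ("a", date(2020, 1, 11)),
            ("a", date(2020, 1, 21)), ("a", date(2020, 2, 20))]
    gaps = interval_series(demo)
    print(gaps)
    print(holdout_accuracy(gaps, mean, tolerance=10, min_visits=4))
```

Keeping the forecaster as a parameter makes it straightforward to compare a simple baseline against a fitted model on the same hold-out split.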