ClickHouse: A Simple Performance Test
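The previous article, "ClickHouse之初步认识" (A First Look at ClickHouse), gave a brief introduction to ClickHouse. This one runs a simple performance test. The test data is US civil aviation flight data from 1987 to 2017, about 170 million rows.
Environment:
CentOS 6.3, 32 GB of RAM, 24 cores
Download script: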
#!/bin/bash
for s in `seq 1987 2017`
do
    for m in `seq 1 12`
    do
        mwget -n 128 http://transtats.bts.gov/PREZIP/On_Time_On_Time_Performance_${s}_${m}.zip
    done
done

The script uses mwget, a multithreaded version of wget, with 128 threads, because plain wget is too slow. For mwget installation, see: https://my.oschina.net/766/blog/156807
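Where mwget is not installed, the same loop can be approximated with standard tools. The following is only a sketch, not the script used for this test: the function names `ontime_urls` and `download_all` are hypothetical, the limit of 8 parallel wget processes is an arbitrary choice, and the URL pattern is copied unchanged from the script above. It separates generating the URL list from fetching it, so the list can be previewed before anything is downloaded.

```shell
#!/bin/bash
# Print one download URL per line for every month of every year in the range.
# The URL pattern comes from the original script; the function name is hypothetical.
ontime_urls() {
  local first_year=$1 last_year=$2
  local year month
  for year in $(seq "$first_year" "$last_year"); do
    for month in $(seq 1 12); do
      echo "http://transtats.bts.gov/PREZIP/On_Time_On_Time_Performance_${year}_${month}.zip"
    done
  done
}

# Fetch all URLs with up to 8 wget processes at a time (8 is an assumption).
# -c resumes partially downloaded files when the script is re-run.
download_all() {
  ontime_urls 1987 2017 | xargs -n 1 -P 8 wget -c -q
}

# Preview only: show the first URL and the number of files (31 years x 12 months).
ontime_urls 1987 2017 | head -n 1
ontime_urls 1987 2017 | wc -l
```

Calling `download_all` starts the actual transfer. Parallelism here is per file rather than per connection, which differs from what mwget does within a single file.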
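The downloaded files are zip archives.
Once the data is downloaded, create the table.
Log in with the client: clickhouse-client -m. Without -m, which enables multi-line mode, a statement spread over several lines fails with an error.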
CREATE TABLE ontime (
    Year UInt16, Quarter UInt8, Month UInt8, DayofMonth UInt8, DayOfWeek UInt8, FlightDate Date,
    UniqueCarrier FixedString(7), AirlineID Int32, Carrier FixedString(2), TailNum String, FlightNum String,
    OriginAirportID Int32, OriginAirportSeqID Int32, OriginCityMarketID Int32, Origin FixedString(5),
    OriginCityName String, OriginState FixedString(2), OriginStateFips String, OriginStateName String, OriginWac Int32,
    DestAirportID Int32, DestAirportSeqID Int32, DestCityMarketID Int32, Dest FixedString(5),
    DestCityName String, DestState FixedString(2), DestStateFips String, DestStateName String, DestWac Int32,
    CRSDepTime Int32, DepTime Int32, DepDelay Int32, DepDelayMinutes Int32, DepDel15 Int32,
    DepartureDelayGroups String, DepTimeBlk String,
    TaxiOut Int32, WheelsOff Int32, WheelsOn Int32, TaxiIn Int32,
    CRSArrTime Int32, ArrTime Int32, ArrDelay Int32, ArrDelayMinutes Int32, ArrDel15 Int32,
    ArrivalDelayGroups Int32, ArrTimeBlk String,
    Cancelled UInt8, CancellationCode FixedString(1), Diverted UInt8,
    CRSElapsedTime Int32, ActualElapsedTime Int32, AirTime Int32, Flights Int32, Distance Int32, DistanceGroup UInt8,
    CarrierDelay Int32, WeatherDelay Int32, NASDelay Int32, SecurityDelay Int32, LateAircraftDelay Int32,
    FirstDepTime String, TotalAddGTime String, LongestAddGTime String,
    DivAirportLandings String, DivReachedDest String, DivActualElapsedTime String, DivArrDelay String, DivDistance String,
    Div1Airport String, Div1AirportID Int32, Div1AirportSeqID Int32, Div1WheelsOn String,
    Div1TotalGTime String, Div1LongestGTime String, Div1WheelsOff String, Div1TailNum String,
    Div2Airport String, Div2AirportID Int32, Div2AirportSeqID Int32, Div2WheelsOn String,
    Div2TotalGTime String, Div2LongestGTime String, Div2WheelsOff String, Div2TailNum String,
    Div3Airport String, Div3AirportID Int32, Div3AirportSeqID Int32, Div3WheelsOn String,
    Div3TotalGTime String, Div3LongestGTime String, Div3WheelsOff String, Div3TailNum String,
    Div4Airport String, Div4AirportID Int32, Div4AirportSeqID Int32, Div4WheelsOn String,
    Div4TotalGTime String, Div4LongestGTime String, Div4WheelsOff String, Div4TailNum String,
    Div5Airport String, Div5AirportID Int32, Div5AirportSeqID Int32, Div5WheelsOn String,
    Div5TotalGTime String, Div5LongestGTime String, Div5WheelsOff String, Div5TailNum String
) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192);

Import the data:
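for i in *.zip; do echo "$i"; unzip -cq "$i" '*.csv' | sed 's/\.00//g' | clickhouse-client --query="INSERT INTO ontime FORMAT CSVWithNames"; done

The loop streams the CSV out of each archive, strips every ".00" with sed, and pipes the result into the INSERT.

Now for the query tests: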
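The first test is a row count over the whole table. The exact statement is not shown in the text, so the query below is an assumption, as are the helper name `timed_query` and the `CLICKHOUSE_CLIENT` override variable. The sketch runs the query non-interactively and relies on the client's `--time` option, which reports the elapsed time on stderr.

```shell
# Client binary; override CLICKHOUSE_CLIENT if it is installed elsewhere (assumption).
CLICKHOUSE_CLIENT=${CLICKHOUSE_CLIENT:-clickhouse-client}

# Run one query non-interactively; --time reports the elapsed seconds on stderr.
# The function name is hypothetical.
timed_query() {
  "$CLICKHOUSE_CLIENT" --time --query "$1"
}

# Count every row in the ontime table; skipped when no client is installed.
if command -v "$CLICKHOUSE_CLIENT" >/dev/null 2>&1; then
  timed_query "SELECT count(*) FROM ontime"
fi
```

The same helper accepts any of the SELECT statements used later in this post as its single argument.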
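With 170 million rows, the count took 0.034 seconds. Then again, a column-oriented database that could not count quickly would have little going for it.
Next, some other statements.
Number of flights per day of the week from 2000 to 2016: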
SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2016 GROUP BY DayOfWeek ORDER BY c DESC;

Top 10 origin airports by number of departures delayed more than 10 minutes, 2000 to 2008:

SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;

These queries all restrict the range of years. What about a query over the whole table?
For example:
SELECT OriginCityName, DestCityName, count() AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;

As you can see, it is still ridiculously fast. Tempted? If so, install ClickHouse, import the data, and run the tests yourself.
References:
https://raw.githubusercontent.com/yandex/ClickHouse/master/doc/example_datasets/1_ontime.txt
Reposted from: https://www.cnblogs.com/gomysql/p/6655553.html