A Collection of 9 Sets of SQL Interview Questions for Data Analysis

Golden September and Silver October are job-hunting season again. Recently quite a few members of the DataFrog data analysis community have asked about the SQL questions that come up in data analyst interviews, so I have put together some classic problems based on everyone's interview questions and real work scenarios. Special thanks to classmates 001 and 002 for helping compile them.
Interview Question Set 1

1. The problem is as follows:

2. Create the table and insert the data:
create table datafrog_test1 (
    userid varchar(20),
    changjing varchar(20),
    inttime varchar(20)
);

insert into datafrog_test1 values
(1,1001,1400), (2,1002,1401), (1,1002,1402), (1,1001,1402), (2,1003,1403),
(2,1004,1404), (3,1003,1400), (4,1004,1402), (4,1003,1403), (4,1001,1403),
(4,1002,1404), (5,1002,1402), (5,1002,1403), (5,1001,1404), (5,1003,1405);

3. Solution approach: rank within each user by time, then join the values with concat

select concat(t.userid, '-', group_concat(t.changjing separator '-')) as result
from (
    select userid, changjing, inttime,
           if(@tmp = userid, @rank := @rank + 1, @rank := 1) as new_rank,
           @tmp := userid as tmp
    from (
        select userid, changjing, min(inttime) as inttime
        from datafrog_test1
        group by userid, changjing
    ) temp
    order by userid, inttime
) t
where t.new_rank <= 2
group by t.userid;

4. Output:
5. Note:

The code above may raise an error caused by the ONLY_FULL_GROUP_BY SQL mode; the fix is to run set sql_mode = ''; before running it. MySQL is fine as a query engine, but it is a bit underpowered for analysis — things like ranking and medians are awkward — so at work people usually use pandas, SQL Server, Oracle, Hive, or Spark for analysis. That is not to say MySQL is useless; on the contrary, it is extremely useful, easy to pick up, and the foundation for the rest.
6. Here is the Hive solution:

with tmp as (
    select userid, changjing, order_num, changjing1
    from (
        select userid, changjing,
               row_number() over (partition by userid order by inttime asc) as order_num,
               lag(changjing, 1, 'datafrog') over (partition by userid order by inttime asc) as changjing1
        from datafrog_test1
    ) as a
    where changjing != changjing1
),
tmp2 as (
    select userid, changjing, order_num, changjing1,
           row_number() over (partition by userid order by order_num) as changjing_num
    from tmp
)
select concat(userid, '-', concat_ws('-', collect_set(changjing)))
from tmp2
where changjing_num < 3
group by userid;

Interview Question Set 2
1. The problem is as follows:

2. Create the table and load the data:
create database xiangji;
use xiangji;

create table userinfo (
    uid varchar(10),
    app_name varchar(20),
    duration int(10),
    times int(10),
    dayno varchar(30)
);

load data infile 'D:/d.csv' into table userinfo
fields terminated by ','
ignore 1 lines;

Our DataFrog classmates have prepared a copy of this dataset for you; see the end of the article. Note that the queries below refer to the table as aui; treat aui as the userinfo table above, presumably renamed when the data was loaded.

3. First, the daily active users calculation

select dayno, count(distinct uid) as 活躍度
from aui
where app_name = '相機'
group by dayno;

4. Next-day retention
Self-join the table, use case when to find the ids whose dates differ by exactly one day, count them to get the number of next-day retained users, and finally use distinct to remove duplicates.

5. Next-day, 3-day, and 7-day retention
select day1,
       count(distinct a.uid) 活躍,
       count(distinct case when day2-day1=1 then a.uid end) 次留,
       count(distinct case when day2-day1=3 then a.uid end) 三留,
       count(distinct case when day2-day1=7 then a.uid end) 七留,
       concat(count(distinct case when day2-day1=1 then a.uid end)/count(distinct a.uid)*100, '%') 次日留存率,
       concat(count(distinct case when day2-day1=3 then a.uid end)/count(distinct a.uid)*100, '%') 三日留存率,
       concat(count(distinct case when day2-day1=7 then a.uid end)/count(distinct a.uid)*100, '%') 七日留存率
from (select uid, date_format(dayno, '%Y%m%d') day1 from aui where app_name = '相機') a
left join (select uid, date_format(dayno, '%Y%m%d') day2 from aui where app_name = '相機') b
       on a.uid = b.uid
group by day1;

Knowledge points: the date_format function, combining aggregates with conditional expressions, and appending a percent sign with concat. One caution: day2 - day1 subtracts the yyyymmdd strings as numbers, which only works within a single month; datediff() on real dates is safer across month boundaries.
Interview Question Set 3

1. Pivot rows to columns (left to right in the figure)
-- create the course table
create table course (
    id varchar(20),
    teacher_id varchar(20),
    week_day varchar(20),
    has_course varchar(20)
);

insert into course values
(1,1,2,"Yes"),
(2,1,3,"Yes"),
(3,2,1,"Yes"),
(4,3,2,"Yes"),
(5,1,2,"Yes");

select * from course;

-- pivot rows to columns, method 1
select id, teacher_id,
       (case when week_day = 1 then "Yes" else " " end) "mon",
       (case when week_day = 2 then "Yes" else " " end) "tue",
       (case when week_day = 3 then "Yes" else " " end) "wed",
       (case when week_day = 4 then "Yes" else " " end) "thu",
       (case when week_day = 5 then "Yes" else " " end) "fri"
from course;

-- method 2
select id, teacher_id,
       (case week_day when 1 then "Yes" else " " end) "mon",
       (case week_day when 2 then "Yes" else " " end) "tue",
       (case week_day when 3 then "Yes" else " " end) "wed",
       (case week_day when 4 then "Yes" else " " end) "thu",
       (case week_day when 5 then "Yes" else " " end) "fri"
from course;

Knowledge point: the case when syntax.

4. The results are as follows:
Interview Question Set 4

1. Problem

2. Create the table and insert the data:
create table a1 (
    name varchar(20),
    english int,
    maths int,
    music int
);

insert into a1 values ("Jim", 90, 88, 99);

3. Solution
select name, "english" as subject, english as score from a1
union
select name, "maths" as subject, maths as score from a1
union
select name, "music" as subject, music as score from a1
order by name;

There are other ways to solve this — think it over; the question mainly tests converting columns to rows, as sketched below.
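One alternative, offered here as a hedged sketch rather than the original post's answer: build a one-column list of subject names, cross join it against a1, and pick each score with case when. The derived-table alias s is illustrative.

select a1.name,
       s.subject,
       case s.subject
            when 'english' then a1.english
            when 'maths'   then a1.maths
            when 'music'   then a1.music
       end as score
from a1
cross join (
    select 'english' as subject
    union all select 'maths'
    union all select 'music'
) s
order by a1.name, s.subject;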
Interview Question Set 5

1. Problem

2. Create the table and insert the data:
create table A2 (
    FDATE datetime,
    value int
);

insert into a2 values
("2018/11/23", 10),
("2018/12/31", 3),
("2019/2/9", 53),
("2019/3/31", 23),
("2019/7/8", 11),
("2019/7/31", 10);

# delete from a2 where value = 10;
select * from a2;

3. Solution to question 1
-- add an index on FDATE
create index id_FDATE on a2(FDATE);
show index from a2;

4. Solution to question 2
select FYEAR, FMONTH, VALU as VALUE, YSUM, SUM
from (
    select b1.*,
           @YSUM := if(@year = fyear, @YSUM + valu, valu) as YSUM,
           @sum := @sum + valu as SUM,
           @year := fyear
    from (
        select year(FDATE) as FYEAR, month(FDATE) as FMONTH, sum(VALUE) as valu
        from A2
        -- group by both year and month so months from different years stay separate
        group by FYEAR, FMONTH
        order by FYEAR, FMONTH
    ) b1,
    (select @sum := 0, @YSUM := 0, @YEAR := null) b2
) B;
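As a side note, user variables in SELECT are deprecated as of MySQL 8.0. A hedged window-function rewrite of the same monthly and cumulative sums (alias names ysum and total are illustrative) might look like:

select year(FDATE) as fyear,
       month(FDATE) as fmonth,
       sum(value) as monthly_value,
       -- running total within each year
       sum(sum(value)) over (partition by year(FDATE) order by month(FDATE)) as ysum,
       -- running total across all months
       sum(sum(value)) over (order by year(FDATE), month(FDATE)) as total
from A2
group by year(FDATE), month(FDATE)
order by fyear, fmonth;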
Interview Question Set 6

1. Problem

2. Create the table and insert the data:
create table userlog (
    id int,
    name varchar(10),
    EmailAddress varchar(50),
    lastlogon varchar(50)
);

insert into userlog values (100,'test4','test4@yahoo.cn','2007-11-25 16:31:26');
insert into userlog values (13,'test1','test4@yahoo.cn','2007-3-22 16:27:07');
insert into userlog values (19,'test1','test4@yahoo.cn','2007-10-25 14:13:46');
insert into userlog values (42,'test1','test4@yahoo.cn','2007-10-25 14:20:10');
insert into userlog values (45,'test2','test4@yahoo.cn','2007-4-25 14:17:39');
insert into userlog values (49,'test2','test4@yahoo.cn','2007-5-25 14:22:36');

3.1 Solution to question 1, method 1
select user2.name, user2.EmailAddress, count(user2.day1), max(user2.time2)
from (
    select *, max(time1) time2
    from (
        select name, EmailAddress,
               date_format(lastlogon, '%Y-%m-%d %H:%i:%s') as time1,
               date_format(substring_index(lastlogon, ' ', 1), '%Y-%m-%d') as day1
        from userlog
        order by name asc, time1 asc
    ) as user1
    group by user1.name, user1.day1
) user2
group by user2.name;

3.2 Solution to question 1, method 2
select t1.name, max(t1.lastlogon), max(t1.days), count(distinct days)
from (
    select id, name, emailAddress,
           date_format(lastlogon, '%Y-%m-%d %H:%i:%s') as lastlogon,
           date_format(lastlogon, '%Y-%m-%d') as days
    from data.userlog as u
) as t1
group by t1.name;

4. Solution to question 2
drop table if exists tmp_table;
create temporary table tmp_table
select user2.name as Name,
       user2.time1 as lastlogon,
       user2.rank1 as num_logontime,
       user2.rank2 as num_logonday
from (
    select *,
           @rank := if(@nam = user1.name, @rank + 1, 1) rank1,
           @rank1 := if(@nam = user1.name, if(@da = user1.day1, @rank1, @rank1 + 1), 1) rank2,
           @nam := user1.name,
           @da := user1.day1
    from (
        select name, EmailAddress,
               date_format(lastlogon, '%Y-%m-%d %H:%i:%s') as time1,
               date_format(substring_index(lastlogon, ' ', 1), '%Y-%m-%d') as day1
        from userlog
        order by name asc, time1 asc
    ) as user1,
    (select @rank := 0, @rank1 := 0, @nam := null, @da := null) temp
    order by user1.name, user1.time1
) as user2;

select * from tmp_table;

Interview Question Set 7

1. Problem

2. Create the table and insert the data:
create table tableA (
    qq int(20),
    game varchar(20)
);

insert into tableA values
(10000, "a"),
(10000, "b"),
(10000, "c"),
(20000, "c"),
(20000, "d");

3. Solution to question 1
drop table if exists tableB;

create temporary table tableB (
    select qq, group_concat(game separator "-") as game
    from tableA
    group by qq
);

select * from tableB;

4.1 Solution to question 2: substring_index basics
select qq, game,
       substring_index(game, "-", 0) a0,
       substring_index(game, "-", 1) a1,
       substring_index(game, "-", 2) a2,
       substring_index(game, "-", 3) a3,
       substring_index(game, "-", -1) "a-1",
       substring_index(substring_index(game, "-", 2), "-", -1) "a(a-1)"
from tableB;

4.2 Solution to question 2: help_topic_id in mysql.help_topic increments from 0
select * from mysql.help_topic;

4.3 Solution to question 2: the approach

Left-join the built-in table mysql.help_topic to get the incrementing help_topic_id, and use length() to bound it by the number of "-" separators:
select a.*,
       help_topic_id as id,
       substring_index(game, "-", help_topic_id + 1) as "sbi(id+1)",
       substring_index(substring_index(game, "-", help_topic_id + 1), "-", -1) as "sbi(sbi(id+1),-1)",
       length(game) as 全長,
       length(replace(game, "-", "")) as 除去符號長度,
       length(game) - length(replace(game, "-", "")) + 1 as "逗號個數+1"
from tableB a
left join mysql.help_topic as b
       on help_topic_id < (length(game) - length(replace(game, "-", "")) + 1);

4.4 Solution to question 2: the final query
select qq,
       substring_index(substring_index(game, "-", help_topic_id + 1), "-", -1) as game
from tableB a
left join mysql.help_topic as b
       on help_topic_id < (length(game) - length(replace(game, "-", "")) + 1);

Interview Question Set 8
Thanks to classmate 小王子 for compiling this set.

1. Problem

1.1 In case the screenshot is hard to read, here are the questions spelled out:
1. From June 1, 2019 to date, compute the daily DAU (active users, i.e. users who logged in).
2. From 20190601 to date, compute the daily number of new users and of existing users who claimed red packets, plus the average amount claimed per user and the average number of claims per user.
3. From March 2019 onward, for each month, segment users by the number of days on which they claimed red packets (1, 2, 3, ... 30, 31 days), and compute the number of claiming users, the average amount per user, and the average number of claims per user.
4. From March 2019 onward, for each month, compute the number of users who claimed red packets and the number who did not, plus the average number of active days in the month for each group.
5. From March 2019 to date, list the registration date of each month's active users; for users registered before March 1, 2019 the date can be left blank.
6. From March 2019 to date, compute the daily next-day retention rate of all users, of red-packet users, and of non-red-packet users.
7. From June 1, 2019 to date, compute the amount of the first red packet claimed by each day's new users.
8. From March 1, 2019 to date, compute the time gap between the first and the second red packet claimed by each new user (counting only users who claimed a red packet on their registration day; a new user is one with is_new = 1 in the DAU table on the registration day or later).

2. Create the tables and load the data
2.1 User activity model table

create table tmp_liujg_dau_based (
    imp_date varchar(20) not null comment 'date',
    qimei varchar(20) not null comment 'unique user id',
    is_new varchar(10) comment 'new-user flag: 1 = new user, 0 = existing user',
    primary key (imp_date, qimei)
);
alter table tmp_liujg_dau_based comment 'user activity model table';

2.2 Red packet claim model table
create table tmp_liujg_packed_based (
    imp_date varchar(20) comment 'date',
    report_time varchar(20) comment 'claim timestamp',
    qimei varchar(20) not null comment 'unique user id',
    add_money varchar(20) not null comment 'amount claimed, in fen'
);
alter table tmp_liujg_packed_based comment 'red packet claim model table';

2.3 Loading data into the user activity model table
insert into tmp_liujg_dau_based values
('20190301','001','0'),('20190301','002','0'),('20190301','003','1'),('20190301','004','1'),('20190301','005','1'),('20190301','006','1'),
('20190302','001','0'),('20190302','002','0'),('20190302','003','0'),('20190302','005','0'),('20190302','006','0'),('20190302','007','1'),
('20190303','005','0'),('20190303','006','0'),('20190303','007','0'),('20190303','008','1'),('20190303','009','1'),('20190303','010','1'),
('20190401','008','0'),('20190401','009','0'),('20190401','010','0'),('20190401','011','1'),('20190401','012','1'),
('20190402','009','0'),('20190402','010','0'),('20190402','011','0'),('20190402','012','0'),('20190402','013','1'),('20190402','014','1'),
('20190501','001','0'),('20190501','002','0'),('20190501','008','0'),('20190501','007','0'),('20190501','015','1'),('20190501','016','1'),('20190501','017','1'),('20190501','018','1'),
('20190601','008','0'),('20190601','017','0'),('20190601','018','0'),('20190601','019','1'),('20190601','020','1'),('20190601','021','1'),('20190601','022','1'),
('20190603','021','0'),('20190603','022','0'),('20190603','011','0'),('20190603','012','0'),('20190603','023','1'),
('20190701','023','0'),('20190701','008','0'),('20190701','011','0'),('20190701','022','0'),('20190701','012','0'),('20190701','024','1'),('20190701','025','1'),('20190701','026','1'),('20190701','027','1'),
('20190705','026','0'),('20190705','027','0'),('20190705','009','0'),('20190705','010','0'),('20190705','028','1'),('20190705','029','1');

2.4 Loading data into the red packet claim model table
insert into tmp_liujg_packed_based values
('20190301','2019/03/01 10:15:01','001','1.05'),('20190301','2019/03/01 13:15:01','001','2.30'),('20190301','2019/03/01 11:15:01','002','0.80'),('20190301','2019/03/01 17:15:01','002','0.89'),('20190301','2019/03/01 14:15:01','003','2.12'),('20190301','2019/03/01 18:15:01','003','1.12'),('20190301','2019/03/01 17:15:01','005','1.12'),('20190301','2019/03/01 19:15:01','005','0.12'),('20190301','2019/03/01 09:15:01','006','0.98'),('20190301','2019/03/01 11:15:01','006','1.45'),
('20190302','2019/03/02 09:30:01','001','0.78'),('20190302','2019/03/02 19:30:01','001','0.88'),('20190302','2019/03/02 10:30:01','003','0.68'),('20190302','2019/03/02 09:30:01','005','1.01'),('20190302','2019/03/02 14:30:01','005','1.88'),('20190302','2019/03/02 14:30:01','006','1.88'),('20190302','2019/03/02 10:30:01','006','0.68'),('20190302','2019/03/02 15:30:01','007','0.68'),('20190302','2019/03/02 16:30:01','007','1.78'),
('20190303','2019/03/03 16:30:01','005','0.68'),('20190303','2019/03/03 08:50:01','006','0.32'),('20190303','2019/03/03 10:50:01','006','1.78'),('20190303','2019/03/03 10:50:01','007','0.32'),('20190303','2019/03/03 15:50:01','008','1.01'),('20190303','2019/03/03 17:50:01','008','1.68'),('20190303','2019/03/03 10:30:01','010','1.88'),('20190303','2019/03/03 15:50:01','010','0.32'),
('20190401','2019/04/01 09:50:00','008','0.18'),('20190401','2019/04/01 11:50:00','009','0.88'),('20190401','2019/04/01 12:35:00','009','0.32'),('20190401','2019/04/01 17:50:00','010','1.01'),('20190401','2019/04/01 12:35:00','011','1.68'),('20190401','2019/04/01 12:50:00','011','0.88'),('20190401','2019/04/01 15:50:00','012','0.32'),('20190401','2019/04/01 17:50:00','012','1.68'),('20190401','2019/04/01 12:35:00','012','1.88'),
('20190402','2019/04/02 09:50:00','009','0.18'),('20190402','2019/04/02 19:50:00','009','1.18'),('20190402','2019/04/02 17:50:00','010','0.88'),('20190402','2019/04/02 16:50:00','010','0.32'),('20190402','2019/04/02 19:50:00','010','0.32'),('20190402','2019/04/02 16:50:00','013','0.88'),('20190402','2019/04/02 19:55:00','013','0.88'),('20190402','2019/04/02 20:55:00','013','1.01'),('20190402','2019/04/02 19:55:00','014','0.32'),('20190402','2019/04/02 20:55:00','014','1.01'),
('20190501','2019/05/01 09:50:00','001','1.18'),('20190501','2019/05/01 09:55:00','002','0.32'),('20190501','2019/05/01 10:55:00','002','0.32'),('20190501','2019/05/01 10:00:00','007','0.88'),('20190501','2019/05/01 10:55:00','015','0.88'),('20190501','2019/05/01 10:00:00','015','0.32'),('20190501','2019/05/01 14:00:00','017','1.01'),('20190501','2019/05/01 15:00:00','017','1.01'),('20190501','2019/05/01 15:30:00','018','0.88'),('20190501','2019/05/01 16:30:00','018','0.68'),
('20190601','2019/06/01 09:50:00','008','1.38'),('20190601','2019/06/01 14:50:00','017','0.88'),('20190601','2019/06/01 11:50:00','019','1.01'),('20190601','2019/06/01 13:50:00','019','0.88'),('20190601','2019/06/01 14:50:00','019','0.68'),
('20190602','2019/06/02 09:50:00','021','0.38'),('20190602','2019/06/02 13:50:00','012','0.88'),('20190602','2019/06/02 15:50:00','012','1.12'),('20190602','2019/06/02 13:59:00','023','0.88'),('20190602','2019/06/02 15:50:00','023','1.01'),
('20190701','2019/07/01 09:50:00','023','0.38'),('20190701','2019/07/01 13:50:00','023','0.78'),('20190701','2019/07/01 14:50:00','008','0.68'),('20190701','2019/07/01 14:50:00','024','0.68'),('20190701','2019/07/01 15:50:00','024','1.68'),('20190701','2019/07/01 14:50:00','026','0.68'),('20190701','2019/07/01 18:50:00','026','1.68'),('20190701','2019/07/01 14:50:00','027','0.68'),('20190701','2019/07/01 19:35:00','027','1.11'),
('20190702','2019/07/02 09:50:00','026','0.78'),('20190702','2019/07/02 11:50:00','026','0.78'),('20190702','2019/07/02 13:50:00','028','1.01'),('20190702','2019/07/02 14:35:00','028','0.88'),('20190702','2019/07/02 15:35:00','028','0.33');

3. Solution to question 1
Compute the daily DAU (active users, i.e. users who logged in) from June 1, 2019 to date.
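No query survives in the text for this question; a minimal sketch against the tables above (the dau alias is illustrative) might be:

select imp_date, count(distinct qimei) as dau
from tmp_liujg_dau_based
where imp_date >= '20190601'
group by imp_date
order by imp_date;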
4. Solution to question 2

Compute, from 20190601 to date, the daily number of new and existing users claiming red packets, plus the average amount claimed per user and the average number of claims per user.
select aa.imp_date,
       aa.is_new,
       count(distinct aa.qimei) 領取紅包人數,
       sum(aa.add_money) / count(distinct aa.qimei) 人均領取金額,
       count(aa.report_time) / count(distinct aa.qimei) 人均領取次數
from (
    select a.imp_date, a.qimei, a.add_money, a.report_time,
           case when b.is_new = 1 then '新用戶'
                when b.is_new = 0 then '老用戶'
                else '領取紅包但未登陸' end is_new
    from tmp_liujg_packed_based a
    left join tmp_liujg_dau_based b
           on a.imp_date = b.imp_date and a.qimei = b.qimei
    where a.imp_date >= '20190601'  -- >= so that June 1 itself is included
) aa
group by aa.imp_date, aa.is_new;

5. Solution to question 3
From March 2019 onward, for each month, segment users by the number of days on which they claimed red packets (1, 2, 3, ... 30, 31 days), and compute the number of claiming users, the average amount per user, and the average number of claims per user.
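The original answer for this question is also missing here; a hedged sketch (all aliases are illustrative) that first aggregates per user per month and then groups by the number of claiming days:

select t.mon,
       t.claim_days,
       count(*) as users,
       sum(t.total_money) / count(*) as money_per_user,
       sum(t.total_cnt) / count(*) as claims_per_user
from (
    -- per-user, per-month totals
    select left(imp_date, 6) as mon,
           qimei,
           count(distinct imp_date) as claim_days,
           sum(add_money) as total_money,
           count(*) as total_cnt
    from tmp_liujg_packed_based
    where imp_date >= '20190301'
    group by left(imp_date, 6), qimei
) t
group by t.mon, t.claim_days;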
6. Solution to question 4

From March 2019 onward, for each month, compute the number of users who claimed red packets and the number who did not, plus the average number of active days in the month for each group.
select left(cc.imp_date, 6) 月份,
       cc.is_packet_user 紅包用戶,
       count(distinct cc.qimei) 用戶數量,
       count(is_packet_user) / count(distinct cc.qimei) 月活躍天
from (
    select a.imp_date, a.qimei, b.qimei hb_qimei,
           case when b.qimei is not null then '紅包用戶' else '非紅包用戶' end is_packet_user,
           case when b.qimei is not null then b.qimei else a.qimei end is_qimei
    from tmp_liujg_dau_based a
    left join (
        select distinct left(imp_date, 6) imp_date, qimei
        from tmp_liujg_packed_based
        where imp_date >= '20190301'
    ) b
           on left(a.imp_date, 6) = b.imp_date and a.qimei = b.qimei
) cc
group by left(cc.imp_date, 6), cc.is_packet_user;

7. Solution to question 5
From March 2019 to date, list the registration date of each month's active users; for users registered before March 1, 2019 the date can be left blank.
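Again no query survives. A hedged sketch, under the assumption that a user's first is_new = '1' date counts as the registration date, leaving it null for users registered before 20190301 (aliases mon and reg_date are illustrative):

select distinct left(a.imp_date, 6) as mon,
       a.qimei,
       b.reg_date
from tmp_liujg_dau_based a
left join (
    select qimei, min(imp_date) as reg_date
    from tmp_liujg_dau_based
    where is_new = '1' and imp_date >= '20190301'
    group by qimei
) b on a.qimei = b.qimei
where a.imp_date >= '20190301'
order by mon, a.qimei;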
8. Solution to question 6

Compute, from June 2019 to date, the daily next-day retention rate of all users, of red-packet users, and of non-red-packet users.

Next-day retention rate: among the users added on a given day, the number who log in again on the second day after registering, divided by the total number of new users on the first day.
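A hedged sketch of the overall next-day retention (the alias next_day_retention is illustrative); splitting red-packet vs non-red-packet users would add the same kind of join against tmp_liujg_packed_based:

select a.imp_date,
       count(distinct b.qimei) / count(distinct a.qimei) as next_day_retention
from tmp_liujg_dau_based a
left join tmp_liujg_dau_based b
       on a.qimei = b.qimei
      -- b logged in exactly one calendar day after a
      and str_to_date(b.imp_date, '%Y%m%d') = date_add(str_to_date(a.imp_date, '%Y%m%d'), interval 1 day)
where a.imp_date >= '20190601'
group by a.imp_date;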
9. Solution to question 7

Compute, from June 1, 2019 to date, the amount of the first red packet claimed by each day's new users.
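No query survives here either; a hedged sketch using a correlated subquery to pick each new user's earliest claim on their registration day (the alias first_packet_money is illustrative):

select a.imp_date,
       a.qimei,
       b.add_money as first_packet_money
from tmp_liujg_dau_based a
join tmp_liujg_packed_based b
     on a.imp_date = b.imp_date and a.qimei = b.qimei
where a.is_new = '1'
  and a.imp_date >= '20190601'
  and b.report_time = (select min(c.report_time)
                       from tmp_liujg_packed_based c
                       where c.qimei = b.qimei and c.imp_date = b.imp_date);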
10. Solution to question 8

Compute, from March 1, 2019 to date, the time gap between the first and the second red packet claimed by each new user (counting only users who claimed a red packet on their registration day; a new user is one with is_new = 1 in the DAU table on the registration day or later).
select aaa.imp_date 注冊日期,
       aaa.qimei,
       aaa.report_time 新用戶第一次領取紅包的時間,
       bbb.第二次領紅包時間,
       timestampdiff(minute, aaa.report_time, bbb.第二次領紅包時間) 第一次第二次紅包時間間隔_分鐘
from (
    -- first packet claimed by each new user on their registration day
    select a.imp_date, a.qimei, b.report_time
    from (select * from tmp_liujg_dau_based where is_new = '1' and imp_date >= '20190301') a
    inner join (
        select * from tmp_liujg_packed_based
        where concat_ws('-', imp_date, qimei, report_time) in (
            select concat_ws('-', imp_date, qimei, min(report_time))
            from tmp_liujg_packed_based
            group by imp_date, qimei
        )
    ) b on a.imp_date = b.imp_date and a.qimei = b.qimei
) aaa
left join (
    select min(bb.imp_date) imp_date,  -- registration date
           bb.qimei,
           min(report_time) 第二次領紅包時間
    from (
        select imp_date, qimei, report_time
        from tmp_liujg_packed_based
        -- keep claims by new users registered on or after March 1
        where imp_date >= '20190301'
          and qimei in (select distinct qimei
                        from tmp_liujg_dau_based
                        where is_new = '1' and imp_date >= '20190301')
          -- exclude each new user's first claim
          and concat_ws('-', imp_date, qimei, report_time) not in (
              select concat_ws('-', a.imp_date, a.qimei, b.report_time)
              from (select * from tmp_liujg_dau_based where is_new = '1' and imp_date >= '20190301') a
              inner join (
                  select * from tmp_liujg_packed_based
                  where concat_ws('-', imp_date, qimei, report_time) in (
                      select concat_ws('-', imp_date, qimei, min(report_time))
                      from tmp_liujg_packed_based
                      group by imp_date, qimei
                  )
              ) b on a.imp_date = b.imp_date and a.qimei = b.qimei
          )
    ) bb
    group by bb.qimei
) bbb on aaa.qimei = bbb.qimei and aaa.imp_date = bbb.imp_date;

To be organized
8: Counting consecutive occurrences in SQL
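This section is still a placeholder in the original. As a hedged illustration only — the table t(id, num) and every alias below are assumptions, not the original's setup — the usual user-variable approach numbers each consecutive run and then measures its length:

select num, grp, count(*) as run_length
from (
    -- start a new group whenever the value changes
    select num,
           @grp := if(@prev = num, @grp, @grp + 1) as grp,
           @prev := num as prev
    from (select * from t order by id) o,
         (select @grp := 0, @prev := null) vars
) s
group by num, grp;

Adding having count(*) >= 3 would keep only values that appear at least three times in a row.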
9: Hive SQL questions

10: Four Hive interview questions
References

1. Data analysis interviews: MySQL group-wise ranking and the order of variable assignment
2. The retention dataset for question set 2