17 - Functions & Complex Data Types & Hive Partitioned Tables

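Our raw data may be in JSON format.

Usually our data is in delimited text format, for example:          id,username,age,gender

If the data is in JSON format instead, it looks like this:          {"id":1,"username":"ruozedata","age":2,"gender":"unknown"}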

As shown in the figure below:

 

Our steps are:

JSON data ==> Hive table ==> SQL

Upload the rating.json file to /home/hadoop, then load each JSON record as a single string column:

create table rating_json(json string);


load data local inpath "/home/hadoop/rating.json" into table rating_json;

select * from rating_json limit 10;

select json_tuple(json,"movie","rate","time","userid") from rating_json limit 10;

 

select json_tuple(json,"movie","rate","time","userid") as (movie_id,rate,time,user_id) from rating_json limit 10;
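Since json_tuple returns every field as a string, a natural next step is to cast the fields and store them in a structured table so that later SQL does not have to parse the JSON again. The following is only a sketch under assumptions: the table name rating_parsed, the alias ts (used instead of time, which may clash with a keyword in some Hive versions), and the target types int/bigint are all hypothetical choices, not something the original notes specify. Check a few rows of rating.json before settling on the types.

```sql
-- Sketch: parse the JSON column once with LATERAL VIEW and keep typed columns.
-- rating_parsed, ts, and the int/bigint types are assumptions for illustration.
create table rating_parsed as
select
  cast(t.movie_id as int)  as movie_id,
  cast(t.rate     as int)  as rate,
  cast(t.ts       as bigint) as ts,       -- assumed to be a Unix timestamp in seconds
  cast(t.user_id  as int)  as user_id
from rating_json
lateral view json_tuple(json, 'movie', 'rate', 'time', 'userid') t
  as movie_id, rate, ts, user_id;

-- Ordinary SQL then works on the typed columns, e.g. the average rating per movie.
select movie_id, avg(rate) as avg_rate
from rating_parsed
group by movie_id
limit 10;
```

Using lateral view rather than calling json_tuple directly in the select list makes it possible to select other expressions alongside the parsed fields, which a bare UDTF call in the select list does not allow.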