17-Functions & Complex Data Types & Hive Partitioned Tables
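Our raw data may arrive in JSON format.
Usually the data is delimited text, for example: id,username,age,gender
Sometimes, though, each record is a JSON object, for example: {"id":1,"username":"ruozedata","age":2,"gender":"unknown"}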
As shown in the figure below:
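The steps are:
json data ==> hive table ==> sql
First, upload the rating.json file to /home/hadoop on the local filesystem. Then create a single-column table, load the file into it, and extract the fields with json_tuple: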
create table rating_json(json string);
load data local inpath "/home/hadoop/rating.json" into table rating_json;
select * from rating_json limit 10;
select json_tuple(json,"movie","rate","time","userid") from rating_json limit 10;
select json_tuple(json,"movie","rate","time","userid") as (movie_id,rate,time,user_id) from rating_json limit 10;
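To make the per-row behavior of json_tuple easier to picture, here is a rough Python analogue. It is only a sketch under stated assumptions: the helper name json_tuple_py is hypothetical, the sample record is made up (only the field names movie, rate, time and userid come from the query above), and None stands in for the NULL that Hive returns when a key is missing or the line is not valid JSON.

```python
import json

def json_tuple_py(line, *keys):
    """Hypothetical helper: one value per requested key, None if absent."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        # Unparseable input: every requested field comes back empty.
        return tuple(None for _ in keys)
    if not isinstance(record, dict):
        return tuple(None for _ in keys)
    # Look up each key in the order requested, like the column list in the query.
    return tuple(record.get(k) for k in keys)

# Made-up sample row; the real contents of rating.json are not shown in these notes.
sample = '{"movie":"1193","rate":"5","time":"978300760","userid":"1"}'
movie_id, rate, time, user_id = json_tuple_py(sample, "movie", "rate", "time", "userid")
print(movie_id, rate, time, user_id)
```

The point of the sketch is that the JSON text is parsed once per row and all requested keys are read from that single parse, which is why one json_tuple call can produce several columns at a time.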