How to transpose/pivot data in Hive?

I know there's no direct way to transpose data in Hive. I followed the question "Is there a way to transpose data in Hive?", but since it has no final answer, I could not get all the way.

This is the table I have:

 | ID   |   Code   |  Proc1   |   Proc2 | 
 | 1    |    A     |   p      |   e     | 
 | 2    |    B     |   q      |   f     |
 | 3    |    B     |   p      |   f     |
 | 3    |    B     |   q      |   h     |
 | 3    |    B     |   r      |   j     |
 | 3    |    C     |   t      |   k     |

Here Proc1 can have any number of values. ID, Code and Proc1 together form a unique key for this table. I want to pivot/transpose this table so that each unique value in Proc1 becomes a new column, and the corresponding value from Proc2 is the value in that column for the corresponding row. In essence, I'm trying to get something like:

 | ID   |   Code   |  p   |   q |  r  |   t |
 | 1    |    A     |   e  |     |     |     |
 | 2    |    B     |      |   f |     |     |
 | 3    |    B     |   f  |   h |  j  |     |
 | 3    |    C     |      |     |     |  k  |

In the new transformed table, ID and Code together form the primary key. Following the question I mentioned above, I got this far using the to_map UDAF. (Disclaimer: this may not be a step in the right direction; I mention it in case it is.)

 | ID   |   Code   |  Map_Aggregation   | 
 | 1    |    A     |   {p:e}            |
 | 2    |    B     |   {q:f}            |
 | 3    |    B     |   {p:f, q:h, r:j } |  
 | 3    |    C     |   {t:k}            |

But I don't know how to get from this step to the pivoted/transposed table I want. Any help on how to proceed would be great! Thanks.


Here is the approach I used to solve this problem using Hive's built-in UDF "map":

select
    b.id,
    b.code,
    concat_ws('',b.p) as p,
    concat_ws('',b.q) as q,
    concat_ws('',b.r) as r,
    concat_ws('',b.t) as t
from 
    (
        select id, code,
        collect_list(a.group_map['p']) as p,
        collect_list(a.group_map['q']) as q,
        collect_list(a.group_map['r']) as r,
        collect_list(a.group_map['t']) as t
        from (
            select
              id,
              code,
              map(proc1,proc2) as group_map 
            from 
              test_sample
        ) a
        group by
            a.id,
            a.code
    ) b;
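
To see why the query above works, its intermediate steps can be run on their own. This is only a sketch against the same test_sample table; it relies on two behaviours of Hive: looking up a key that is missing from a map returns NULL, and collect_list skips NULLs. Because ID, Code and Proc1 are unique together, each collected array holds at most one value, which is why the outer concat_ws('', ...) yields either that value or an empty string.

```sql
-- Step 1: one single-entry map per input row, e.g. {"p":"e"} for the first row.
select
    id,
    code,
    map(proc1, proc2) as group_map
from
    test_sample;

-- Step 2: per (id, code) group, collect the values stored under key 'p'.
-- Rows whose map has no 'p' key contribute NULL, which collect_list drops,
-- so groups without a 'p' row end up with an empty array.
select
    id,
    code,
    collect_list(map(proc1, proc2)['p']) as p
from
    test_sample
group by
    id,
    code;
```

For the sample data, the group (3, B) should collect a single-element array containing 'f' in step 2, while (3, C) gets an empty array.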

"concat_ws" and "map" are hive udf and "collect_list" is a hive udaf.


Yet another solution.

Pivot using the Hivemall to_map function.

SELECT
  uid,
  kv['c1'] AS c1,
  kv['c2'] AS c2,
  kv['c3'] AS c3
FROM (
  SELECT uid, to_map(key, value) kv
  FROM vtable
  GROUP BY uid
) t

 | uid | c1 | c2 | c3 |
 | 101 | 11 | 12 | 13 |
 | 102 | 21 | 22 | 23 |
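
If Hivemall is not installed, a similar map can probably be built with Hive's built-in functions alone. The sketch below is an assumption-laden alternative, not part of the original answer: it assumes vtable has the columns uid, key and value used above, and that neither keys nor values contain the ',' or ':' characters used as delimiters. Note that str_to_map returns string values, so the pivoted columns come back as strings.

```sql
-- Pivot without Hivemall: build "key:value" strings, join them per uid,
-- then parse the joined string back into a map with str_to_map.
SELECT
  uid,
  kv['c1'] AS c1,
  kv['c2'] AS c2,
  kv['c3'] AS c3
FROM (
  SELECT
    uid,
    str_to_map(
      concat_ws(',', collect_list(concat(key, ':', cast(value AS string)))),
      ',',   -- delimiter between pairs
      ':'    -- delimiter between key and value
    ) AS kv
  FROM vtable
  GROUP BY uid
) t;
```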

Unpivot

SELECT t1.uid, t2.key, t2.value
FROM htable t1
LATERAL VIEW explode (map(
  'c1', c1,
  'c2', c2,
  'c3', c3
)) t2 as key, value

 | uid | key | value |
 | 101 | c1  | 11    |
 | 101 | c2  | 12    |
 | 101 | c3  | 13    |
 | 102 | c1  | 21    |
 | 102 | c2  | 22    |
 | 102 | c3  | 23    |
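
To try the pivot and unpivot queries above, sample tables could be set up roughly as follows. The table and column names come from the queries; the column types are an assumption (the answer does not state them), and the multi-row INSERT ... VALUES form needs Hive 0.14 or later.

```sql
-- Vertical (key/value) table used as input to the pivot query.
CREATE TABLE vtable (uid INT, key STRING, value INT);

INSERT INTO vtable VALUES
  (101, 'c1', 11), (101, 'c2', 12), (101, 'c3', 13),
  (102, 'c1', 21), (102, 'c2', 22), (102, 'c3', 23);

-- Horizontal table used as input to the unpivot query.
CREATE TABLE htable (uid INT, c1 INT, c2 INT, c3 INT);

INSERT INTO htable VALUES
  (101, 11, 12, 13),
  (102, 21, 22, 23);
```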


Here is the solution I ended up using:

add jar brickhouse-0.7.0-SNAPSHOT.jar;
CREATE TEMPORARY FUNCTION collect AS 'brickhouse.udf.collect.CollectUDAF';

select 
    id, 
    code,
    group_map['p'] as p,
    group_map['q'] as q,
    group_map['r'] as r,
    group_map['t'] as t
    from ( select
        id, code,
        collect(proc1,proc2) as group_map 
        from test_sample 
        group by id, code
    ) gm;

The collect UDAF used here (called with two arguments, it builds a map, like to_map) comes from the Brickhouse repo: https://github.com/klout/brickhouse


I have not written this code, but I think you can use some of the UDFs provided by Klout's Brickhouse: https://github.com/klout/brickhouse

Specifically, you could use their collect as described here: http://brickhouseconfessions.wordpress.com/2013/03/05/use-collect-to-avoid-the-self-join/

and then explode the arrays (they will be of differing lengths) using the methods detailed in this post: http://brickhouseconfessions.wordpress.com/2013/03/07/exploding-multiple-arrays-at-the-same-time-with-numeric_ra


  1. I created a dummy table called hive using the query below:

create table hive (id Int,Code String, Proc1 String, Proc2 String);

  2. I loaded the data into the table:
insert into hive values(1,'A','p','e');
insert into hive values(2,'B','q','f');
insert into hive values(3,'B','p','f');
insert into hive values(3,'B','q','h');
insert into hive values(3,'B','r','j');
insert into hive values(3,'C','t','k');
  3. Now use the query below to produce the output:
select id,code,
     case when collect_list(p)[0] is null then '' else collect_list(p)[0] end as p,
     case when collect_list(q)[0] is null then '' else collect_list(q)[0] end as q,
     case when collect_list(r)[0] is null then '' else collect_list(r)[0] end as r,
     case when collect_list(t)[0] is null then '' else collect_list(t)[0] end as t
     from(
            select id, code,
            case when proc1 ='p' then proc2 end as p,
            case when proc1 ='q' then proc2 end as q,
            case when proc1 ='r' then proc2 end as r,
            case when proc1 ='t' then proc2 end as t
            from hive
        ) dummy group by id,code;
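
The same result can probably be reached in a single pass with conditional aggregation, without the subquery or collect_list. This is a sketch rather than the answerer's method; it reuses the hive table defined above and relies on ID, Code and Proc1 being unique together, so each max(...) sees at most one non-NULL value per group. coalesce(..., '') mirrors the empty strings produced by the query above.

```sql
-- Pivot with conditional aggregation: each CASE keeps proc2 only for the
-- matching proc1 value, and max() picks that single non-NULL value per group.
select
    id,
    code,
    coalesce(max(case when proc1 = 'p' then proc2 end), '') as p,
    coalesce(max(case when proc1 = 'q' then proc2 end), '') as q,
    coalesce(max(case when proc1 = 'r' then proc2 end), '') as r,
    coalesce(max(case when proc1 = 't' then proc2 end), '') as t
from hive
group by id, code;
```

As with every approach on this page, the set of output columns (p, q, r, t) has to be listed by hand, because Hive needs to know the result columns when the query is compiled.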
