pandas_melt

Signature:

pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)

Parameters:

  • frame : DataFrame

  • id_vars : tuple, list, or ndarray, optional

    Column(s) to use as identifier variables.

  • value_vars : tuple, list, or ndarray, optional

    Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.

  • var_name : scalar

    Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.

  • value_name : scalar, default ‘value’

    Name to use for the ‘value’ column.

  • col_level : int or string, optional

    If columns are a MultiIndex then use this level to melt.
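
To make the defaults above concrete, here is a minimal sketch on a small made-up frame. The data and the axis name "metric" are illustrative assumptions, not taken from the pandas documentation. It shows that leaving value_vars unset unpivots every non-id column, and that an unset var_name falls back to frame.columns.name when one is set.

```python
import pandas as pd

# Illustrative data; "metric" below is a hypothetical name for the column axis.
df = pd.DataFrame({"A": ["a", "b"], "B": [1, 3], "C": [2, 4]})

# No var_name and no columns.name: the new columns are "variable" and "value".
default_names = pd.melt(df, id_vars=["A"])

# Once the column axis has a name, that name replaces "variable".
named = df.rename_axis("metric", axis="columns")
from_axis_name = pd.melt(named, id_vars=["A"])

print(list(default_names.columns))
print(list(from_axis_name.columns))
```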

A simple example:


In [1]: import pandas as pd
   ...: df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
   ...:                    'B': {0: 1, 1: 3, 2: 5},
   ...:                    'C': {0: 2, 1: 4, 2: 6}})
   ...:

In [2]: df
Out[2]:
   A  B  C
0  a  1  2
1  b  3  4
2  c  5  6

id_vars - the original columns to keep

First, try id_vars on its own. Column A is kept as is, and every other column is unpivoted into rows.

In [3]: pd.melt(df, id_vars=['A'])
Out[3]:
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6
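
The shape of that result follows a simple rule, sketched here under the assumption that the same df is rebuilt from scratch: each unpivoted column contributes one block of len(df) rows, and the id column is repeated once per block.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})
long_df = pd.melt(df, id_vars=["A"])

# One block of len(df) rows per unpivoted column (here B and C).
value_columns = [c for c in df.columns if c != "A"]
expected_rows = len(df) * len(value_columns)
print(len(long_df) == expected_rows)

# The id column is repeated once per block, and the index is renumbered from 0.
print(long_df["A"].tolist())
print(long_df.index.tolist())
```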

value_vars - the columns to unpivot into rows

Use value_vars to unpivot column B only. Only the values from column B appear in the output; column C is dropped.

In [4]: pd.melt(df, id_vars=['A'], value_vars=['B'])
Out[4]:
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

In [5]: pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
Out[5]:
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6
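
As a small check of what value_vars leaves out, the sketch below rebuilds the same df and confirms that a column listed in neither id_vars nor value_vars is dropped. It also shows the case with no id_vars at all, which this walkthrough does not otherwise cover.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})

only_b = pd.melt(df, id_vars=["A"], value_vars=["B"])
# C is in neither id_vars nor value_vars, so none of its values survive.
print(only_b["variable"].unique().tolist())

# Without id_vars there are no identifier columns in the result.
no_ids = pd.melt(df, value_vars=["B", "C"])
print(list(no_ids.columns))
```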

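var_name - the name of the column that holds the original column names after unpivoting; defaults to variable

value_name - the name of the column that holds the unpivoted values; defaults to value

First, try renaming them to varNameTest and valueNameTest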

In [6]: pd.melt(df, id_vars=['A'], value_vars=['B', 'C'],
   ...:         var_name='varNameTest', value_name='valueNameTest')
Out[6]:
   A varNameTest  valueNameTest
0  a           B              1
1  b           B              3
2  c           B              5
3  a           C              2
4  b           C              4
5  c           C              6
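
The same renaming is also available through the DataFrame.melt method. The sketch below assumes the same df and checks that both spellings give the same result, and that var_name and value_name can be set independently of each other.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})

via_function = pd.melt(df, id_vars=["A"],
                       var_name="varNameTest", value_name="valueNameTest")
via_method = df.melt(id_vars=["A"],
                     var_name="varNameTest", value_name="valueNameTest")
print(via_function.equals(via_method))

# Setting only var_name leaves value_name at its default, "value".
only_var = df.melt(id_vars=["A"], var_name="varNameTest")
print(list(only_var.columns))
```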

Next, try multi-index columns

In [7]: df.columns = [list('ABC'), list('DEF')]

In [8]: df
Out[8]:
   A  B  C
   D  E  F
0  a  1  2
1  b  3  4
2  c  5  6

This is where the col_level parameter is needed:

In [9]: pd.melt(df, col_level=0, id_vars=['A'], value_vars=['B'])
Out[9]:
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

In [10]: pd.melt(df, id_vars=[('A', 'D')], value_vars=[('B', 'E')])
Out[10]:
  (A, D) variable_0 variable_1  value
0      a          B          E      1
1      b          B          E      3
2      c          B          E      5
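
To see what col_level selects, the following sketch rebuilds the two-level frame and melts it once per level. It assumes the levels are unnamed, as in the session above, so the variable column keeps its default name.

```python
import pandas as pd

df = pd.DataFrame({"A": ["a", "b", "c"], "B": [1, 3, 5], "C": [2, 4, 6]})
df.columns = [list("ABC"), list("DEF")]  # two-level column index

# col_level=0 addresses columns by their top-level labels (A, B, C).
top = pd.melt(df, col_level=0, id_vars=["A"], value_vars=["B"])

# col_level=1 addresses the same columns by their second-level labels (D, E, F).
bottom = pd.melt(df, col_level=1, id_vars=["D"], value_vars=["E"])

print(list(top.columns))
print(list(bottom.columns))
```

Both calls unpivot the same underlying data; only the labels used to refer to the columns, and therefore the labels that appear in the result, differ.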