Signature:
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
Parameters:
frame : DataFrame
id_vars : tuple, list, or ndarray, optional
    Column(s) to use as identifier variables.
value_vars : tuple, list, or ndarray, optional
    Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
var_name : scalar
    Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.
value_name : scalar, default ‘value’
    Name to use for the ‘value’ column.
col_level : int or string, optional
    If columns are a MultiIndex then use this level to melt.
A simple example:
    In [1]: import pandas as pd
       ...: df = pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
       ...:                    'B': {0: 1, 1: 3, 2: 5},
       ...:                    'C': {0: 2, 1: 4, 2: 6}})

    In [2]: df
    Out[2]:
       A  B  C
    0  a  1  2
    1  b  3  4
    2  c  5  6
id_vars - the original columns to keep as identifiers
First, try id_vars. Column A is kept as is, while every other column is turned into rows:
    In [3]: pd.melt(df, id_vars=['A'])
    Out[3]:
       A variable  value
    0  a        B      1
    1  b        B      3
    2  c        B      5
    3  a        C      2
    4  b        C      4
    5  c        C      6
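As a quick sanity check on the shape of that result, here is a minimal sketch. It rebuilds the same frame with plain lists instead of nested dicts (an equivalent construction chosen here for brevity), and the names `melted` and `n_value_cols` are only illustrative.

```python
import pandas as pd

# Same data as the example frame, built from lists for brevity.
df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 3, 5], 'C': [2, 4, 6]})

melted = pd.melt(df, id_vars=['A'])

# Every non-identifier column contributes one output row per original row.
n_value_cols = len(df.columns) - 1
print(melted.shape)          # (6, 3): 3 rows x 2 melted columns, 3 output columns
print(melted['A'].tolist())  # the identifier values repeat once per melted column
```

The identifier column is simply repeated for each melted column, which is why A appears twice over in the output.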
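value_vars - the names of the columns to turn into rows
Using value_vars to melt only column B, B is the only value column left in the output. Listing both B and C gives the same result as before: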
    In [4]: pd.melt(df, id_vars=['A'], value_vars=['B'])
    Out[4]:
       A variable  value
    0  a        B      1
    1  b        B      3
    2  c        B      5

    In [5]: pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
    Out[5]:
       A variable  value
    0  a        B      1
    1  b        B      3
    2  c        B      5
    3  a        C      2
    4  b        C      4
    5  c        C      6
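The parameter description says that omitting value_vars melts every column that is not in id_vars. A small sketch of that equivalence follows, again using a list-based copy of the example frame; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 3, 5], 'C': [2, 4, 6]})

# Omitting value_vars melts every column not listed in id_vars...
implicit = pd.melt(df, id_vars=['A'])
# ...which should match listing those columns explicitly.
explicit = pd.melt(df, id_vars=['A'], value_vars=['B', 'C'])
same = implicit.equals(explicit)
print(same)

# Restricting value_vars drops the unlisted columns (here C) from the result.
only_b = pd.melt(df, id_vars=['A'], value_vars=['B'])
remaining = only_b['variable'].unique().tolist()
print(remaining)
```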
var_name - renames the column that holds the former column names (default: variable)
value_name - renames the column that holds the values (default: value)
First, try renaming them to varNameTest and valueNameTest:
    In [6]: pd.melt(df, id_vars=['A'], value_vars=['B', 'C'],
       ...:         var_name='varNameTest', value_name='valueNameTest')
    Out[6]:
       A varNameTest  valueNameTest
    0  a           B              1
    1  b           B              3
    2  c           B              5
    3  a           C              2
    4  b           C              4
    5  c           C              6
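The parameter list also notes that when var_name is None, melt falls back to frame.columns.name before using 'variable'. The sketch below illustrates that fallback; the columns name 'metric' and the override 'field' are made-up values for the demonstration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 3, 5], 'C': [2, 4, 6]})
df.columns.name = 'metric'  # hypothetical name, chosen only for this demo

# With var_name left as None, the columns' name is used for the variable column.
default_named = pd.melt(df, id_vars=['A'])
print(list(default_named.columns))  # ['A', 'metric', 'value']

# An explicit var_name takes precedence over columns.name.
overridden = pd.melt(df, id_vars=['A'], var_name='field')
print(list(overridden.columns))     # ['A', 'field', 'value']
```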
Next, try MultiIndex columns:
    In [7]: df.columns = [list('ABC'), list('DEF')]

    In [8]: df
    Out[8]:
       A  B  C
       D  E  F
    0  a  1  2
    1  b  3  4
    2  c  5  6
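This is where the col_level parameter is needed. Alternatively, id_vars and value_vars can be given as full column tuples: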
    In [9]: pd.melt(df, col_level=0, id_vars=['A'], value_vars=['B'])
    Out[9]:
       A variable  value
    0  a        B      1
    1  b        B      3
    2  c        B      5

    In [10]: pd.melt(df, id_vars=[('A', 'D')], value_vars=[('B', 'E')])
    Out[10]:
      (A, D) variable_0 variable_1  value
    0      a          B          E      1
    1      b          B          E      3
    2      c          B          E      5
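The session above only melts on level 0. As a complementary sketch, the same frame can be melted on the second level by passing col_level=1 and using that level's labels (D, E, F); the name `lower` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [1, 3, 5], 'C': [2, 4, 6]})
df.columns = [list('ABC'), list('DEF')]  # two-level column MultiIndex

# With col_level=1, id_vars and value_vars refer to labels on the second level.
lower = pd.melt(df, col_level=1, id_vars=['D'], value_vars=['E'])
print(lower)
```

With col_level set, the other level is dropped from the result, so the output has the plain columns D, variable and value rather than tuple labels.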