reshape2 is a data-reshaping package developed by Hadley Wickham. Its main functions are melt and cast, which convert data between long format and wide format.
install.packages("reshape2")
library("reshape2")
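Wide data: each column is one observed variable, and each row holds one observation's values for all of the variables
>head(airquality) # the focus is on a single observation and the values of each of its variables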
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
Long data: the values of each observed variable are stored separately, which suits analysis of one variable at a time
>airquality1 <- melt(airquality)
No id variables; using all as measure variables
>head(airquality1) # the variable column holds the variable names; the value column holds the corresponding observed values
variable value
1 Ozone 41
2 Ozone 36
3 Ozone 12
4 Ozone 18
5 Ozone NA
6 Ozone 28
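melt function: melt "melts" wide data into long data
melt(data, id.vars, measure.vars, variable.name = "variable", ..., na.rm = FALSE, value.name = "value", factorsAsStrings = TRUE)
Arguments:
id.vars: specifies the identifier variables. The other variables are melted with respect to them; the identifier variables themselves are not melted
measure.vars: specifies the measured variables. These are melted; the other variables are not
If only one of id.vars and measure.vars is given, all remaining variables are used as the other one
If neither is given, factor and character variables are used as id.vars and all other variables as measure.vars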
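To illustrate the default described above, here is a minimal sketch using the built-in iris data set, where Species is the only factor column. The object name iris_long is chosen here for illustration only.

```r
library(reshape2)

# Neither id.vars nor measure.vars is given, so melt() falls back to
# its default: the factor column Species becomes the identifier and
# the four numeric columns are melted.
iris_long <- melt(iris)

names(iris_long)
head(iris_long)
```

Each of the 150 rows of iris contributes one row per numeric column, so the melted result has four times as many rows as the original.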
# Specify Month and Day as identifier variables
>airquality2 <- melt(airquality, id.vars = c('Month', 'Day'))
>head(airquality2)
Month Day variable value
1 5 1 Ozone 41
2 5 2 Ozone 36
3 5 3 Ozone 12
4 5 4 Ozone 18
5 5 5 Ozone NA
6 5 6 Ozone 28
# Specify Species as the measured variable
>iris1 <- melt(iris, measure.vars = 'Species')
>head(iris1)
Sepal.Length Sepal.Width Petal.Length Petal.Width variable value
1 5.1 3.5 1.4 0.2 Species setosa
2 4.9 3.0 1.4 0.2 Species setosa
3 4.7 3.2 1.3 0.2 Species setosa
4 4.6 3.1 1.5 0.2 Species setosa
5 5.0 3.6 1.4 0.2 Species setosa
6 5.4 3.9 1.7 0.4 Species setosa
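The remaining arguments in the melt signature (variable.name, value.name, na.rm) are not exercised by the examples above. The following is a small sketch of how they might be combined; the column names measurement and reading and the object name aq_long are illustrative choices, not part of the package.

```r
library(reshape2)

# Melt only Ozone and Temp, keeping Month and Day as identifiers.
# variable.name / value.name rename the two generated columns, and
# na.rm = TRUE drops rows whose measured value is NA.
aq_long <- melt(
  airquality,
  id.vars       = c("Month", "Day"),
  measure.vars  = c("Ozone", "Temp"),
  variable.name = "measurement",
  value.name    = "reading",
  na.rm         = TRUE
)

names(aq_long)
head(aq_long)
```

Because Solar.R and Wind appear in neither id.vars nor measure.vars, they are simply left out of the result.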
cast function: cast "casts" long data back into wide data
cast comes in two forms: acast returns a vector/matrix/array, and dcast returns a data frame
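As a quick sketch of the difference between the two forms, the same casting formula can be passed to both and the result types compared. The object names below are illustrative; the formula itself is explained with the dcast arguments.

```r
library(reshape2)

aq_long <- melt(airquality, id.vars = c("Month", "Day"))

# Same formula, two return types.
wide_df  <- dcast(aq_long, Month + Day ~ variable)  # data frame
wide_mat <- acast(aq_long, Month + Day ~ variable)  # matrix

class(wide_df)
is.matrix(wide_mat)
dim(wide_mat)
```

In the data frame, Month and Day are kept as ordinary columns; in the matrix, the identifier combination is carried in the row names and only the measured variables appear as columns.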
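dcast(data, formula, fun.aggregate = NULL, ..., margins = NULL, subset = NULL, fill = NULL, drop = TRUE, value.var = guess_value(data))
Arguments:
formula: the casting formula, the core argument of the function. It has the form x_variable + x_2 ~ y_variable + y_2. The variables on the left identify the rows of the result (the identifier variables), and the values of the variables on the right become its columns (the measured variables), similar to the id.vars and measure.vars arguments of melt
fun.aggregate: aggregation function, such as mean, median or sum; needed when one combination of the formula variables matches more than one value
fill: the value used to fill in missing values
drop: whether to drop missing combinations; defaults to TRUE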
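The fun.aggregate argument matters as soon as the formula no longer identifies a single value per cell. A minimal sketch, assuming the goal is one row of monthly means per month (the object names are illustrative):

```r
library(reshape2)

aq_long <- melt(airquality, id.vars = c("Month", "Day"))

# With only Month on the left, each cell collects about 30 daily
# values, so they must be aggregated. Extra arguments such as
# na.rm = TRUE are passed through to the aggregation function.
monthly_mean <- dcast(aq_long, Month ~ variable,
                      fun.aggregate = mean, na.rm = TRUE)

monthly_mean
```

If fun.aggregate is left out in a case like this, dcast falls back to counting the values with length and prints a message saying so, which is rarely what is wanted.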
Casting long data into wide data by specifying a casting formula
>head(airquality2)
Month Day variable value
1 5 1 Ozone 41
2 5 2 Ozone 36
3 5 3 Ozone 12
4 5 4 Ozone 18
5 5 5 Ozone NA
6 5 6 Ozone 28
>airquality3 <- dcast(airquality2, Month + Day ~ variable)
>head(airquality3)
Month Day Ozone Solar.R Wind Temp
1 5 1 41 190 7.4 67
2 5 2 36 118 8.0 72
3 5 3 12 149 12.6 74
4 5 4 18 313 11.5 62
5 5 5 NA NA 14.3 56
6 5 6 28 NA 14.9 66
>head(airquality)
Ozone Solar.R Wind Temp Month Day
1 41 190 7.4 67 5 1
2 36 118 8.0 72 5 2
3 12 149 12.6 74 5 3
4 18 313 11.5 62 5 4
5 NA NA 14.3 56 5 5
6 28 NA 14.9 66 5 6
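The two printouts above differ only in column order. One way to check the whole round trip rather than just the first six rows is sketched below; the object names are illustrative.

```r
library(reshape2)

aq_long <- melt(airquality, id.vars = c("Month", "Day"))
aq_wide <- dcast(aq_long, Month + Day ~ variable)

# Put the columns back in the original order before comparing.
aq_wide <- aq_wide[, names(airquality)]

# all.equal() compares values with a numeric tolerance and, with
# check.attributes = FALSE, ignores attributes such as row names.
all.equal(aq_wide, airquality, check.attributes = FALSE)
```

all.equal is used instead of identical because melt stacks all measurements into a single numeric column, so columns that were integers in airquality (such as Ozone and Temp) come back as doubles.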







