2018-11-07
A question about building a new data frame
In R, how do I split every column of a data frame into two columns and then combine the results into a new data frame?
For example, given the following data:
kk <- matrix(c("CG","CC","GG","GG","CG","CG","CC","CG","CG","CC","GG","GG"),3,4)
kk <- as.data.frame(kk)
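Split the string in each column into its two characters, turn each character into its own new column, and then bind the new columns together.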
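library(stringr)
# Split each two-character string into its two characters (returns an n x 2 matrix).
# Named split_pair so that it does not mask base::split().
split_pair <- function(temp){
  return(str_split_fixed(as.character(temp), "", 2))
}
jj <- matrix(0, nrow(kk), ncol(kk))  # placeholder columns, dropped after the loop
for(i in 1:ncol(kk)){
  temp <- split_pair(kk[,i])
  jj <- cbind(jj, temp)
}
jj <- jj[, -seq_len(ncol(kk))]  # remove the placeholder columns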
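For reference, the result expected from the 3 x 4 example can also be reproduced without any package. This is only a sketch using base R's `substr`, and it assumes every cell holds exactly two characters; the helper name `split_genotypes` is made up for illustration.

```r
# Hypothetical helper: split every two-character column of a data frame
# into two single-character columns (assumes each cell has exactly 2 characters).
split_genotypes <- function(df) {
  halves <- lapply(df, function(col) {
    col <- as.character(col)                     # also handles factor columns
    cbind(substr(col, 1, 1), substr(col, 2, 2))  # first and second character
  })
  out <- do.call(cbind, halves)                  # bind all pairs in one call
  dimnames(out) <- NULL
  out
}

kk <- as.data.frame(matrix(c("CG","CC","GG","GG","CG","CG",
                             "CC","CG","CG","CC","GG","GG"), 3, 4))
res <- split_genotypes(kk)
print(res)
```

The result is a 3 x 8 character matrix whose first row reads C G G G C C C C, so each original column has become two adjacent columns.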
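Growing a matrix with cbind inside a loop gets slower with every iteration, because each call copies the whole matrix built so far, and genotype data is usually very large.
Storing the pieces in a list and converting them to a matrix once at the end is much faster. Give it a try: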
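library(stringr)  # also provides the %>% pipe
nrows <- 2000
kk <- matrix(rep(c("CG","CC","GG","GG","CG","CG","CC","CG","CG","CC","GG","GG"), 10000),nrows, byrow=TRUE) %>%
  as.data.frame(stringsAsFactors=FALSE)
jj <- list()
for(i in 1:ncol(kk)){
  jj[[i]] <- kk[,i] %>% str_split_fixed("", 2)
}
# unlist() flattens each n x 2 piece column by column,
# so the matrix must be filled by column as well (no byrow=TRUE)
jj <- jj %>% unlist %>%
  matrix(nrow=nrows) %>%
  data.frame(stringsAsFactors=FALSE)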
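To check the speed claim on your own machine, the two strategies can be timed side by side. This is a rough sketch that assumes `stringr` is installed; the function names `grow_cbind` and `collect_list` and the 500 x 200 test size are arbitrary choices for illustration, and no particular timing is implied.

```r
library(stringr)

# Strategy 1: grow the result with cbind() on every iteration
grow_cbind <- function(df) {
  out <- NULL
  for (i in seq_len(ncol(df))) {
    out <- cbind(out, str_split_fixed(as.character(df[[i]]), "", 2))
  }
  out
}

# Strategy 2: collect the pieces in a list and bind them once at the end
collect_list <- function(df) {
  pieces <- vector("list", ncol(df))
  for (i in seq_len(ncol(df))) {
    pieces[[i]] <- str_split_fixed(as.character(df[[i]]), "", 2)
  }
  do.call(cbind, pieces)
}

set.seed(1)
geno <- as.data.frame(
  matrix(sample(c("CG", "CC", "GG"), 500 * 200, replace = TRUE), nrow = 500),
  stringsAsFactors = FALSE
)

t_grow <- system.time(a <- grow_cbind(geno))
t_list <- system.time(b <- collect_list(geno))
print(c(grow_cbind = t_grow[["elapsed"]], collect_list = t_list[["elapsed"]]))
stopifnot(identical(a, b))  # both strategies must produce the same matrix
```

The gap between the two timings should widen as the number of columns grows, since only the first strategy re-copies everything it has built so far on each pass.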





