99999久久久久久亚洲,欧美人与禽猛交狂配,高清日韩av在线影院,一个人在线高清免费观看,啦啦啦在线视频免费观看www

熱線電話:13121318867

登錄
首頁(yè)精彩閱讀R語(yǔ)言作為BI中ETL的工具
R語(yǔ)言作為BI中ETL的工具
2017-05-14
收藏

R語(yǔ)言作為BI中ETL的工具

R語(yǔ)言作為BI中ETL的工具,增刪改

R語(yǔ)言提供了強(qiáng)大的R_package與各種數(shù)據(jù)庫(kù)進(jìn)行數(shù)據(jù)交互。
外加其強(qiáng)大數(shù)據(jù)變換清洗函數(shù),為ETL提供一條方便快捷的道路。
RODBC
ROracal
RMysql
Rmongodb

    step1 新建連接con,并查看其信息

library(RODBC)
con<-odbcConnect("LI")
con  

RODBC Connection 1
Details:
  case=nochange
  DSN=LI
  UID=
  Trusted_Connection=Yes
  APP=RStudio
  WSID=LIYI-PC

    step2 引入數(shù)據(jù)集USArrests

data(USArrests)
head(USArrests)
           Murder Assault UrbanPop Rape
Alabama      13.2     236       58 21.2
Alaska       10.0     263       48 44.5
Arizona       8.1     294       80 31.0
Arkansas      8.8     190       50 19.5
California    9.0     276       91 40.6
Colorado      7.9     204       78 38.7

    step3 將USArrests保存到sqlserver

sqlSave(con, USArrests)

sqlSave(channel, dat, tablename = NULL, append = FALSE,
        rownames = TRUE, colnames = FALSE, verbose = FALSE,
        safer = TRUE, addPK = FALSE, typeInfo, varTypes,
        fast = TRUE, test = FALSE, nastring = NULL)
channel      數(shù)據(jù)庫(kù)連接通道
dat              data frame.要存入的數(shù)據(jù)集
tablename  character 數(shù)據(jù)庫(kù)中表名
index          character 索引列的名字
append       logical邏輯變量 ,是否數(shù)據(jù)集添加/寫(xiě)入已存在的表
rownames  logical 邏輯變量 or character字符串,如果是logical,表示是否把rowname這個(gè)字符串作為數(shù)據(jù)庫(kù)表首列列名,如果是字符串,則將新字符串作為表首列列名
colnames  logical 邏輯變量 是否將數(shù)據(jù)集的列名保留作為表的首行 (謹(jǐn)慎更改,可能導(dǎo)致,列名變成數(shù)據(jù)第一行,各列的數(shù)據(jù)類(lèi)型全部變?yōu)関archar(255)
verbose   display statements as they are sent to the server?
safer     logical邏輯變量.如果真,生成一個(gè)新表,不在已存在的表后添加。如果假,強(qiáng)制刪除已存在的同名表并新建,或者刪除表中已存在的數(shù)據(jù),覆蓋寫(xiě)入
addPK     logical邏輯變量,是否將首列作為主鍵
typeInfo  list 數(shù)據(jù)框中數(shù)據(jù)類(lèi)型。包括character ,double ,integer
varTypes  an optional named character vector giving the DBMSs datatypes to be used for some (or all) of the columns if a table is to be created.可選項(xiàng)涉及各列數(shù)據(jù)類(lèi)型轉(zhuǎn)換,因?yàn)閿?shù)據(jù)庫(kù)中的數(shù)據(jù)類(lèi)型比R語(yǔ)言中的要多很多。
fast      logical. If false, write data a row at a time. If true, use a parametrized INSERT INTO or UPDATE query to write all the data in one operation. 邏輯變量,如果F,數(shù)據(jù)將一次一行地寫(xiě)入,如果為T(mén)ure,將用到變量插入INSERT INTO,或者UPDATE 將數(shù)據(jù)一次性寫(xiě)入
nastring     optional character string to be used for writing NAs to the database. 選擇哪種字符串,將缺失項(xiàng)在數(shù)據(jù)庫(kù)中填充

getSqlTypeInfo("Microsoft SQL Server")
$double
[1] "float"

$integer
[1] "int"

$character
[1] "varchar(255)"

$logical
[1] "varchar(5)"

sqlSave(con, USArrests,rownames = "city", addPK = T) # 原沒(méi)有列名的rownames改名為city,并設(shè)置首列為主鍵key

sqlSave(con, USArrests,'USA2',rownames = "city", addPK = T,fast=T,test=T)  #注意此操作可能在數(shù)據(jù)庫(kù)中create名為USA2的空表

Binding: 'city' DataType 12, ColSize 255
Binding: 'Murder' DataType 6, ColSize 53
Binding: 'Assault' DataType 4, ColSize 10
Binding: 'UrbanPop' DataType 4, ColSize 10
Binding: 'Rape' DataType 6, ColSize 53
Parameters:
no: 1: city Alabama/***/no: 2: Murder 13.2/***/no: 3: Assault 236/***/no: 4: UrbanPop 58/***/no: 5: Rape 21.2/***/
no: 1: city Alaska/***/no: 2: Murder 10/***/no: 3: Assault 263/***/no: 4: UrbanPop 48/***/no: 5: Rape 44.5/***/
no: 1: city Arizona/***/no: 2: Murder 8.1/***/no: 3: Assault 294/***/no: 4: UrbanPop 80/***/no: 5: Rape 31/***/
no: 1: city Arkansas/***/no: 2: Murder 8.8/***/no: 3: Assault 190/***/no: 4: UrbanPop 50/***/no: 5: Rape 19.5/***/
no: 1: city California/***/no: 2: Murder 9/***/no: 3: Assault 276/***/no: 4: UrbanPop 91/***/no: 5: Rape 40.6/***/
no: 1: city Colorado/***/no: 2: Murder 7.9/***/no: 3: Assault 204/***/no: 4: UrbanPop 78/***/no: 5: Rape 38.7/***/
no: 1: city Connecticut/***/no: 2: Murder 3.3/***/no: 3: Assault 110/***/no: 4: UrbanPop 77/***/no: 5: Rape 11.1/***/
no: 1: city Delaware/***/no: 2: Murder 5.9/***/no: 3: Assault 238/***/no: 4: UrbanPop 72/***/no: 5: Rape 15.8/***/
no: 1: city Florida/***/no: 2: Murder 15.4/***/no: 3: Assault 335/***/no: 4: UrbanPop 80/***/no: 5: Rape 31.9/***/
no: 1: city Georgia/***/no: 2: Murder 17.4/***/no: 3: Assault 211/***/no: 4: UrbanPop 60/***/no: 5: Rape 25.8/***/
no: 1: city Hawaii/***/no: 2: Murder 5.3/***/no: 3: Assault 46/***/no: 4: UrbanPop 83/***/no: 5: Rape 20.2/***/
no: 1: city Idaho/***/no: 2: Murder 2.6/***/no: 3: Assault 120/***/no: 4: UrbanPop 54/***/no: 5: Rape 14.2/***/
no: 1: city Illinois/***/no: 2: Murder 10.4/***/no: 3: Assault 249/***/no: 4: UrbanPop 83/***/no: 5: Rape 24/***/
no: 1: city Indiana/***/no: 2: Murder 7.2/***/no: 3: Assault 113/***/no: 4: UrbanPop 65/***/no: 5: Rape 21/***/
no: 1: city Iowa/***/no: 2: Murder 2.2/***/no: 3: Assault 56/***/no: 4: UrbanPop 57/***/no: 5: Rape 11.3/***/
no: 1: city Kansas/***/no: 2: Murder 6/***/no: 3: Assault 115/***/no: 4: UrbanPop 66/***/no: 5: Rape 18/***/
# 此處省略10000字

sqlColumns    Enquire about the column structure of tables on an ODBC database connection. 訪問(wèn)數(shù)據(jù)庫(kù)表的結(jié)構(gòu)

columnsenquire<-sqlColumns(con,'USA2')

str(columnsenquire)

str(columnsenquire)
'data.frame':   5 obs. of  29 variables:
 $ TABLE_CAT                           : chr  "master" "master" "master" "master" ...
 $ TABLE_SCHEM                         : chr  "dbo" "dbo" "dbo" "dbo" ...
 $ TABLE_NAME                          : chr  "USA2" "USA2" "USA2" "USA2" ...
 $ COLUMN_NAME                         : chr  "city" "Murder" "Assault" "UrbanPop" ...
 $ DATA_TYPE                           : int  12 6 4 4 6
 $ TYPE_NAME                           : chr  "varchar" "float" "int" "int" ...
 $ COLUMN_SIZE                         : int  255 53 10 10 53
 $ BUFFER_LENGTH                       : int  255 8 4 4 8
 $ DECIMAL_DIGITS                      : int  NA NA 0 0 NA
 $ NUM_PREC_RADIX                      : int  NA 2 10 10 2
 $ NULLABLE                            : int  0 1 1 1 1
 $ REMARKS                             : chr  NA NA NA NA ...
 $ COLUMN_DEF                          : chr  NA NA NA NA ...
 $ SQL_DATA_TYPE                       : int  12 6 4 4 6
 $ SQL_DATETIME_SUB                    : int  NA NA NA NA NA
 $ CHAR_OCTET_LENGTH                   : int  255 NA NA NA NA
 $ ORDINAL_POSITION                    : int  1 2 3 4 5
 $ IS_NULLABLE                         : chr  "NO" "YES" "YES" "YES" ...
 $ SS_IS_SPARSE                        : int  0 0 0 0 0
 $ SS_IS_COLUMN_SET                    : int  0 0 0 0 0
 $ SS_IS_COMPUTED                      : int  0 0 0 0 0
 $ SS_IS_IDENTITY                      : int  0 0 0 0 0
 $ SS_UDT_CATALOG_NAME                 : chr  NA NA NA NA ...
 $ SS_UDT_SCHEMA_NAME                  : chr  NA NA NA NA ...
 $ SS_UDT_ASSEMBLY_TYPE_NAME           : chr  NA NA NA NA ...
 $ SS_XML_SCHEMACOLLECTION_CATALOG_NAME: chr  NA NA NA NA ...
 $ SS_XML_SCHEMACOLLECTION_SCHEMA_NAME : chr  NA NA NA NA ...
 $ SS_XML_SCHEMACOLLECTION_NAME        : chr  NA NA NA NA ...
 $ SS_DATA_TYPE                        : chr  "39" "109" "38" "38" ...

    step4 查詢(xún)數(shù)據(jù)

    sqlQuery(con,'select * from USArrests')

    # 注意此時(shí)第一列的名字已經(jīng)為city了
             city Murder Assault UrbanPop Rape
    1         Alabama   13.2     236       58 21.2
    2          Alaska   10.0     263       48 44.5
    3         Arizona    8.1     294       80 31.0
    4        Arkansas    8.8     190       50 19.5
    5      California    9.0     276       91 40.6
    6        Colorado    7.9     204       78 38.7
    7     Connecticut    3.3     110       77 11.1
    8        Delaware    5.9     238       72 15.8
    9         Florida   15.4     335       80 31.9
    10        Georgia   17.4     211       60 25.8
    11         Hawaii    5.3      46       83 20.2
    12          Idaho    2.6     120       54 14.2
    13       Illinois   10.4     249       83 24.0
    14        Indiana    7.2     113       65 21.0
    ...

    對(duì)于sql語(yǔ)句可能是以‘XX’ 結(jié)尾則需要用如下形式來(lái)進(jìn)行查詢(xún)

    sqlQuery(con,paste('select * from USArrests',
         "where city = 'Alabama'"))

     city Murder Assault UrbanPop Rape
    1 Alabama   13.2     236       58 21.2

    但是對(duì)于Update,以下卻是失效的

    a<-paste("update [master].[dbo].[USArrests]",
         "set Murder =13.2","where city ='Alabama'")
    sqlQuery(con,a)   # 失效
    sqlUpdate(con,a)  # 失效

    sqlUpdate()  sqlUpdate(channel, dat, tablename = NULL, index = NULL,
          verbose = FALSE, test = FALSE, nastring = NULL,
          fast = TRUE)
    不能進(jìn)行腳本語(yǔ)句直接更新,但是可以進(jìn)行如下操作
    foo <- cbind(city=row.names(USArrests), USArrests)[1:3, c(1,3)]
     foo
           city Assault
    Alabama Alabama     236
    Alaska   Alaska     263
    Arizona Arizona     294
     foo[1,2] <- 200
    foo
           city Assault
    Alabama Alabama     200
    Alaska   Alaska     263
    Arizona Arizona     294
    流程是先選定要更新的行列,將值更新,然后再將值update入庫(kù)

    實(shí)例如下

temp<-sqlQuery(con,paste('select * from USArrests',
"where city = 'Alabama'"))
temp
     city Murder Assault UrbanPop Rape
1 Alabama   13.2     300       58 21.2
  str(temp)
'data.frame':   1 obs. of  5 variables:
 $ city    : Factor w/ 1 level "Alabama": 1
 $ Murder  : num 13.2
 $ Assault : num 300
 $ UrbanPop: int 58
 $ Rape    : num 21.2
  temp[1,]
[1] 300

  temp[1,3]<-200
  sqlUpdate(con, temp, "USArrests")
  sqlQuery(con,paste('select * from USArrests',
"where city = 'Alabama'"))
     city Murder Assault UrbanPop Rape
1 Alabama   13.2     200       58 21.2

sqlFetch(con, "USArrests", rownames = "city", max = 20,rows_at_time = 10)

實(shí)踐后發(fā)現(xiàn),單單對(duì)于簡(jiǎn)單的ETL,sqlQuery,sqlUpdate是足夠了,
寫(xiě)一些for循環(huán)+list.files/liset.dir+reshape/ddplyr/tidyr(進(jìn)行數(shù)據(jù)篩選,清洗,變換),
對(duì)于腳本是否執(zhí)行的問(wèn)題,可以寫(xiě)日志文件,對(duì)ETL過(guò)程進(jìn)行檢測(cè)。
    step5 構(gòu)建日志文件
    監(jiān)督機(jī)制,對(duì)于每次要寫(xiě)入/更新的數(shù)據(jù)進(jìn)行count(originaldata),保存數(shù)據(jù),然后與每次執(zhí)行寫(xiě)入/變更的行數(shù)Rows進(jìn)行比對(duì),比對(duì)結(jié)果result為邏輯變量。將三者連同 執(zhí)行時(shí)間,寫(xiě)成一個(gè)數(shù)據(jù)框,并在數(shù)據(jù)中新建對(duì)應(yīng)的日志表,對(duì)執(zhí)行結(jié)果進(jìn)行監(jiān)督。(對(duì)于追蹤Rows沒(méi)有找到很好的解決方法,send a letter ...)
    暴力美學(xué),對(duì)于sqlSave / sqlQuery 的執(zhí)行結(jié)果進(jìn)行追蹤,成功為“1”,失敗則"ON Errors"
    step6 快速操作數(shù)據(jù)庫(kù)
sqlClear deletes all the rows of the table sqtable. #清楚表中數(shù)據(jù)
sqlDrop removes the table sqtable (if permitted). #刪除表
sqlClear(channel, sqtable, errors = TRUE)
sqlDrop(channel, sqtable, errors = TRUE)

數(shù)據(jù)分析咨詢(xún)請(qǐng)掃描二維碼

若不方便掃碼,搜微信號(hào):CDAshujufenxi

數(shù)據(jù)分析師資訊
更多

OK
客服在線
立即咨詢(xún)
客服在線
立即咨詢(xún)
') } function initGt() { var handler = function (captchaObj) { captchaObj.appendTo('#captcha'); captchaObj.onReady(function () { $("#wait").hide(); }).onSuccess(function(){ $('.getcheckcode').removeClass('dis'); $('.getcheckcode').trigger('click'); }); window.captchaObj = captchaObj; }; $('#captcha').show(); $.ajax({ url: "/login/gtstart?t=" + (new Date()).getTime(), // 加隨機(jī)數(shù)防止緩存 type: "get", dataType: "json", success: function (data) { $('#text').hide(); $('#wait').show(); // 調(diào)用 initGeetest 進(jìn)行初始化 // 參數(shù)1:配置參數(shù) // 參數(shù)2:回調(diào),回調(diào)的第一個(gè)參數(shù)驗(yàn)證碼對(duì)象,之后可以使用它調(diào)用相應(yīng)的接口 initGeetest({ // 以下 4 個(gè)配置參數(shù)為必須,不能缺少 gt: data.gt, challenge: data.challenge, offline: !data.success, // 表示用戶(hù)后臺(tái)檢測(cè)極驗(yàn)服務(wù)器是否宕機(jī) new_captcha: data.new_captcha, // 用于宕機(jī)時(shí)表示是新驗(yàn)證碼的宕機(jī) product: "float", // 產(chǎn)品形式,包括:float,popup width: "280px", https: true // 更多配置參數(shù)說(shuō)明請(qǐng)參見(jiàn):http://docs.geetest.com/install/client/web-front/ }, handler); } }); } function codeCutdown() { if(_wait == 0){ //倒計(jì)時(shí)完成 $(".getcheckcode").removeClass('dis').html("重新獲取"); }else{ $(".getcheckcode").addClass('dis').html("重新獲取("+_wait+"s)"); _wait--; setTimeout(function () { codeCutdown(); },1000); } } function inputValidate(ele,telInput) { var oInput = ele; var inputVal = oInput.val(); var oType = ele.attr('data-type'); var oEtag = $('#etag').val(); var oErr = oInput.closest('.form_box').next('.err_txt'); var empTxt = '請(qǐng)輸入'+oInput.attr('placeholder')+'!'; var errTxt = '請(qǐng)輸入正確的'+oInput.attr('placeholder')+'!'; var pattern; if(inputVal==""){ if(!telInput){ errFun(oErr,empTxt); } return false; }else { switch (oType){ case 'login_mobile': pattern = /^1[3456789]\d{9}$/; if(inputVal.length==11) { $.ajax({ url: '/login/checkmobile', type: "post", dataType: "json", data: { mobile: inputVal, etag: oEtag, page_ur: window.location.href, page_referer: document.referrer }, success: function (data) { } }); } break; case 'login_yzm': pattern = /^\d{6}$/; break; } if(oType=='login_mobile'){ } if(!!validateFun(pattern,inputVal)){ errFun(oErr,'') if(telInput){ $('.getcheckcode').removeClass('dis'); } }else { if(!telInput) { errFun(oErr, errTxt); }else { $('.getcheckcode').addClass('dis'); } return false; } } return true; } function errFun(obj,msg) { obj.html(msg); if(msg==''){ $('.login_submit').removeClass('dis'); }else { $('.login_submit').addClass('dis'); } } function validateFun(pat,val) { return pat.test(val); }