CDA: Trifacta Simplifies Data Wrangling Through a Service
2016-02-07


Trifacta Seeks to Simplify Data Wrangling-as-a-Service

Trifacta, a data analysis services platform, recently received venture capital funding to advance its effort to make data wrangling easier for data analysts. The goal is to collect, cleanse and munge data in a fraction of the time and effort it currently takes.

Data wrangling has traditionally been the most time-consuming and painful part of every Big Data project. Today, data is flowing and heterogeneous, and its attributes change constantly as data sources evolve. NoSQL databases have long tried to address this on the storage side with column-based or document-based models, but the problem of collecting the data and applying semantics to it remains.

Trifacta approaches the problem from a user-centric perspective rather than a developer-centric one. Business analysts and data scientists will be able to cleanse datasets visually. Based on research at Berkeley and Stanford, the platform aims to have people and machines collaborate in extracting insights from datasets.

Automated smart sampling from big data sets, combined with visualization, lets the analyst discover interesting patterns in a fraction of the time. Trifacta can then apply machine learning algorithms to suggest ways to reorganize the information and get it into shape. The analyst can group the dataset into logical parts, normalizing it one step at a time and viewing the outcome in a user-friendly way as the work proceeds. Generalizing to the whole dataset is the last step, which brings the semi-structured dataset into its final shape. The platform is designed from the ground up with user experience in mind, so that data analysts can sift in depth through data without having to develop complex pipelines to cleanse it and load it into the data warehouse.
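
The sample-first workflow described above can be illustrated with a minimal Python sketch. It is not Trifacta's API or algorithm, which this article does not document; the function names (`sample_rows`, `split_name`, `normalize_amount`, `apply_steps`) and the toy records are hypothetical, and plain random sampling stands in for whatever "smart" sampling the product performs. The idea shown is only that cleaning steps are worked out on a small sample and then replayed unchanged on the whole dataset.

```python
import random


def sample_rows(rows, k, seed=0):
    """Draw a small, reproducible random sample to explore interactively."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


# Each wrangling step maps one record to a cleaned copy of that record.
def split_name(rec):
    """Split a free-text name into first and last name fields."""
    first, _, last = rec["name"].strip().partition(" ")
    return {**rec, "first": first, "last": last}


def normalize_amount(rec):
    """Turn amounts such as '$1,200.50' into plain floats."""
    text = str(rec["amount"]).replace("$", "").replace(",", "")
    return {**rec, "amount": float(text)}


def apply_steps(rows, steps):
    """Apply every step, in order, to every record."""
    cleaned = []
    for rec in rows:
        for step in steps:
            rec = step(rec)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw records, invented for illustration only.
raw = [
    {"name": " Ada Lovelace", "amount": "$1,200.50"},
    {"name": "Alan Turing ", "amount": "300"},
    {"name": "Grace Hopper", "amount": "$45"},
]

steps = [split_name, normalize_amount]

# 1. Develop and inspect the steps on a small sample.
preview = apply_steps(sample_rows(raw, 2), steps)

# 2. Generalize: replay the same steps on the whole dataset.
cleaned = apply_steps(raw, steps)
```

Keeping each step as a small function makes it possible to inspect the result after every transformation, which mirrors the "one step at a time" normalization the article describes.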

Trifacta's predecessor research project, DataWrangler, and the accompanying research paper are available online and give a preview of where Trifacta is heading. The product itself is still in closed beta, with demos scheduled by invitation only.

