
Data is increasingly shaping the systems that we interact with every day. Whether you’re using Siri, searching Google, or browsing your Facebook feed, you’re consuming the results of data analysis. Given data’s transformational ability, it’s no wonder that so many data-related roles have been created in the past few years.
The responsibilities of these roles range from predicting the future, to finding patterns in the world around you, to building systems that manipulate millions of records. In this post, we’ll talk about the various data-related roles and how they fit together, and help you figure out which role is the right fit for you.
What is a data analyst?
Data analysts deliver value to their companies by taking data, using it to answer questions, and communicating the results to help make business decisions. Common tasks done by data analysts include cleaning data, performing analysis, and creating data visualizations.
Depending on the industry, the data analyst could go by a different title (e.g. Business Analyst, Business Intelligence Analyst, Operations Analyst, Database Analyst). Regardless of title, the data analyst is a generalist who can fit into many roles and teams to help others make better data-driven decisions.
The data analyst in depth
The data analyst has the potential to turn a traditional business into a data-driven one. While data analyst positions are often ‘entry level’ jobs in the wider field of data, not all analysts are junior. As effective communicators with a mastery of technical tools, data analysts are critical for companies that have segregated technical and business teams.
Their core responsibility is to help others track progress and optimize their focus. How can a marketer use analytics data to help launch their next campaign? How can a sales representative better identify which demographics to target? How can a CEO better understand the underlying reasons behind recent company growth? These are all questions that the data analyst answers by performing analysis and presenting the results. They undertake the complex job of working with data to deliver value to their organization.
An effective data analyst will take the guesswork out of business decisions and help the entire organization thrive. The data analyst must be an effective bridge between different teams by analyzing new data, combining different reports, and translating the outcomes. In turn, this is what allows the organization to keep an accurate pulse on its growth.
The specific skills required will depend on the company’s needs, but these are some common tasks:
Cleaning and organizing raw data.
Using descriptive statistics to get a big-picture view of their data.
Analyzing interesting trends found in the data.
Creating visualizations and dashboards to help the company interpret and make decisions with the data.
Presenting the results of a technical analysis to business clients or internal teams.
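As a rough illustration of the first two tasks in the list above, the following sketch cleans a small raw export and summarizes it with descriptive statistics. The sample values and the helper names `clean` and `describe` are hypothetical, chosen only for this example, and it uses just the Python standard library.

```python
import statistics

# Hypothetical raw export: order values as strings, with blanks and junk entries.
raw_orders = ["120.50", " 89.99", "", "n/a", "240.00", "89.99 ", "15.25"]


def clean(values):
    """Strip whitespace and keep only entries that parse as numbers."""
    cleaned = []
    for value in values:
        text = value.strip()
        try:
            cleaned.append(float(text))
        except ValueError:
            continue  # skip blanks and non-numeric placeholders such as "n/a"
    return cleaned


def describe(numbers):
    """Return a few descriptive statistics for a non-empty list of numbers."""
    return {
        "count": len(numbers),
        "mean": round(statistics.mean(numbers), 2),
        "median": statistics.median(numbers),
        "min": min(numbers),
        "max": max(numbers),
    }


orders = clean(raw_orders)
summary = describe(orders)
print(summary)
```

In practice an analyst would more likely reach for a spreadsheet, SQL, or a data-frame library; the point here is only the shape of the work: discard what cannot be trusted, then get a big-picture view before looking for trends.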
The data analyst brings significant value to both the technical and non-technical sides of an organization. Whether running exploratory analyses or explaining executive dashboards, the analyst fosters greater connection between teams.
What is a data scientist?
A data scientist is a specialist who applies their expertise in statistics and in building machine learning models to make predictions and answer key business questions.
A data scientist still needs to be able to clean, analyze, and visualize data, just like a data analyst. However, a data scientist will have more depth and expertise in these skills, and will also be able to train and optimize machine learning models.
The data scientist in depth
The data scientist can provide immense value by tackling more open-ended questions and leveraging their knowledge of advanced statistics and algorithms. If the analyst focuses on understanding data from the perspective of the past and present, then the scientist focuses on producing reliable predictions for the future.
The data scientist will uncover hidden insights by applying both supervised (e.g. classification, regression) and unsupervised (e.g. clustering, anomaly detection) learning methods in their machine learning models. Neural networks can be used in either setting. Data scientists are essentially training mathematical models that allow them to better identify patterns and derive accurate predictions.
The following are examples of work performed by data scientists:
Evaluating statistical models to determine the validity of analyses.
Using machine learning to build better predictive algorithms.
Testing and continuously improving the accuracy of machine learning models.
Building data visualizations to summarize the conclusion of an advanced analysis.
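To make the idea of building and testing a predictive model concrete, here is a minimal sketch that fits a straight line by ordinary least squares and then measures its error on held-out data. The data points and the function names `fit_line` and `mean_absolute_error` are assumptions made up for this example; real projects would typically use a dedicated machine learning library and far more data.

```python
# Hypothetical data: advertising spend (x) and resulting sales (y).
train = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8), (5.0, 11.1)]
held_out = [(6.0, 13.0), (7.0, 15.1)]


def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(points, slope, intercept):
    """Average absolute gap between predictions and observed values."""
    return sum(abs(y - (slope * x + intercept)) for x, y in points) / len(points)


slope, intercept = fit_line(train)
train_error = mean_absolute_error(train, slope, intercept)
test_error = mean_absolute_error(held_out, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"train MAE={train_error:.3f} held-out MAE={test_error:.3f}")
```

Evaluating on data the model has not seen is what separates a prediction from a description of the past: a model that only fits its training points well says little about how it will behave on new data.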
Data scientists bring an entirely new approach and perspective to understanding data. While an analyst may be able to describe trends and translate those results into business terms, the scientist will raise new questions and build models that make predictions on new data.
What is a data engineer?
Data engineers build and optimize the systems that allow data scientists and analysts to perform their work. Every company depends on its data being accurate and accessible to the individuals who need to work with it. The data engineer ensures that data is properly received, transformed, stored, and made accessible to other users.
The data engineer in depth
The data engineer establishes the foundation that data analysts and scientists build upon. Data engineers are responsible for constructing data pipelines and often have to use complex tools and techniques to handle data at scale. Unlike the previous two career paths, data engineering leans much more toward a software development skill set.
At larger organizations, data engineers can have different focuses, such as working with data tools, maintaining databases, or creating and managing data pipelines. Whatever the focus may be, a good data engineer allows a data scientist or analyst to focus on solving analytical problems, rather than having to move data from source to source.
The data engineer’s mindset is often more focused on building and optimization. The following are examples of tasks that a data engineer might work on:
Building APIs for data consumption.
Integrating external or new datasets into existing data pipelines.
Applying feature transformations for machine learning models on new data.
Continuously monitoring and testing the system to ensure optimized performance.
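The receive, transform, store, and serve flow described in this section can be sketched as a toy pipeline. Everything here is hypothetical: the event format, the `purchases` table, and the function names are invented for illustration, and an in-memory SQLite database stands in for a real data store.

```python
import sqlite3

# Hypothetical raw events as they might arrive from an upstream source.
raw_events = [
    {"customer": " Alice ", "amount": "19.99"},
    {"customer": "bob", "amount": "5"},
    {"customer": "", "amount": "7.50"},        # missing customer: rejected
    {"customer": "carol", "amount": "oops"},   # unparseable amount: rejected
]


def transform(events):
    """Yield normalized (customer, amount) rows, skipping invalid records."""
    for event in events:
        customer = event["customer"].strip().lower()
        try:
            amount = float(event["amount"])
        except ValueError:
            continue
        if customer:
            yield customer, amount


def load(conn, rows):
    """Store transformed rows in a table that downstream users can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases (customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    conn.commit()


def total_by_customer(conn):
    """Small accessor that analysts could call instead of handling raw data."""
    query = (
        "SELECT customer, SUM(amount) FROM purchases "
        "GROUP BY customer ORDER BY customer"
    )
    return dict(conn.execute(query).fetchall())


conn = sqlite3.connect(":memory:")
load(conn, transform(raw_events))
totals = total_by_customer(conn)
print(totals)
```

The design choice worth noticing is the separation of stages: validation happens once, in `transform`, so that everyone reading from the table through `total_by_customer` can rely on the data being clean.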
Your Data-Driven Career Path
Now that we’ve explored these three data-driven careers, the question remains: where do you fit in? The key is to understand that these are three fundamentally different ways to work with data.
The data engineer works on the “back end,” continuously improving data pipelines to ensure that the data the organization relies upon is accurate and available. They will leverage all sorts of tools to ensure that the data is processed correctly and is available to users when they need it. A good data engineer saves a lot of time and effort for the rest of the organization.
The data analyst may then extract a new dataset using the custom API that the engineer built and begin identifying interesting trends in that data, as well as running analyses on any anomalies. The analyst will summarize and present their results in a clear way that allows non-technical teams to better understand where they are and how they’re doing.
Finally, the data scientist will likely build upon the analyst’s initial findings and explore further possibilities for deriving insights from the data. Whether by training machine learning models or by running advanced statistical analyses, the data scientist provides a new perspective on what may be possible in the near future.
Regardless of your specific path, curiosity is a natural prerequisite for all three of these careers. The ability to use data to ask better questions and run more precise experiments is the entire purpose of a data-driven career. Furthermore, the data science field is constantly evolving, so there is a great need to keep learning.
So, to all the current and future data analysts, scientists, and engineers out there: good luck and keep learning!