One of the most important elements of a Big Data project is a rather obvious but often overlooked item: people. Without human involvement or interpretation, Big Data analytics becomes useless, having no purpose and no value. It takes a team to make Big Data work, and even if that team consists of only two individuals, it is still a necessary element.
Bringing people together to build a team can be an arduous process that involves multiple meetings, perhaps recruitment, and, of course, personnel management. Several specialized skills in Big Data are required, and that is what defines the team. Determining those skills is one of the first steps in putting a team together.
One of the first concepts to become acquainted with is the data scientist; a relatively new title, it is not readily recognized or accepted by many organizations, but it is here to stay.
The title data scientist is normally associated with an employee or a business intelligence (BI) consultant who excels at analyzing data, particularly large amounts of data, to help a business gain a competitive edge. The data scientist is usually the de facto team leader during a Big Data analytics project.
The title data scientist is sometimes disparaged because it lacks specificity and can be perceived as an aggrandized synonym for data analyst. Nevertheless, the position is gaining acceptance with large enterprises that are interested in deriving meaning from Big Data, the voluminous amount of structured, unstructured, and semistructured data that a large enterprise produces or has access to.
A data scientist must possess a combination of analytic, machine learning, data mining, and statistical skills as well as experience with algorithms and coding. However, the most critical skill a data scientist should possess is the ability to convey the significance of data in a way that can be easily understood by others.
Finding and hiring talented workers with analytics skills is the first step in creating an effective data analytics team. Organizing that team is the next step; the relationship between IT and BI groups must be incorporated into the team design, leading to a determination of how much autonomy to give to Big Data analytics professionals.
Enterprises with highly organized and centralized corporate structures will lean toward placing an analytics team under an IT department or a business intelligence competency center. However, many experts have found that successful Big Data analytics projects seem to work better using a less centralized approach, giving team members the freedom to interpret results and define new ways of looking at data.
For maximum effectiveness, Big Data analytics teams can be organized by business function or placed directly within a specific business unit. An example of this would be placing an analytics team that focuses on customer churn (the turnover of customer accounts) and other marketing-related analysis in a marketing department, while a risk-focused data analytics project team would be better suited to a finance department.
Ideally, placing the Big Data analytics team into a department where the resulting data have immediate value is the best way to accelerate findings, determine value, and deliver results in an actionable fashion. That way the analyst and the departmental decision makers are speaking the same language and working in a collaborative fashion to eke out the best results.
It all depends on scale. A small business may have different analytical needs than a large business does, and that obviously affects the relationship with the data analysis professionals and the departments they work with.
A case in point would be an engineering firm that is examining large volumes of unstructured data for a technical analysis. The firm itself may be quite small, but the data set may be quite large. For example, if an engineering firm were designing a bridge, the components of Big Data analytics could involve everything from census data to traffic patterns to weather factors, which could be used to uncover load and traffic trends that would affect the design of the bridge. If other elements are added, such as market data (materials costs and anticipated financial growth for the area), the definition of a data scientist may change. That individual may need an engineering background and a keen understanding of economics and may work only with the primary engineers on the project and not with any other company departments.
This can mean that the firm’s marketing and sales departments are left out in the cold. The question then is how important that style of analytics is to those departments; arguably, it is not important at all. In a situation like that, factors such as market analysis, competition, government funding, infrastructure age and usage, and population density may not be as applicable to the work of the in-place data scientist and may instead require an individual with a different skill set to interpret the results successfully.
As analytics needs and organizational size increase, roles may change, as well as the processes and the relationships involved. Larger organizations tend to have the resources and budgets to better leverage their data. In those cases, it becomes important to recognize the primary skills needed by a Big Data analytics team and to build the team around core competencies. Fortunately, it is relatively easy to identify those core competencies, because the tasks of the team can be broken down into three capabilities.
There are three primary capabilities needed in a data analytics team: (1) locating the data, (2) normalizing the data, and (3) analyzing the data.
For the first capability, locating the data, an individual has to be able to find relevant data from internal and external sources and work with the IT department’s data governance team to secure access to the data. That individual may also need to work with external businesses, government agencies, and research firms to gain access to large data sets, as well as understand the difference between structured and unstructured data.
For the second capability, normalizing the data, an individual has to prepare the raw data before they are analyzed, removing any spurious data along the way. This process requires technical skills as well as analytics skills. The individual may also need to know how to combine the data sets, load those data sets on the storage platform, and build a matrix of fields to normalize the contents.
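To make the idea of a matrix of fields more concrete, the following is a minimal sketch in Python, not a prescribed method. It assumes two hypothetical sources (labeled census and traffic, echoing the bridge example) whose field names differ, and it treats a record as spurious if a mapped field is missing, empty, or negative. The source names, field names, and rules are all illustrative assumptions.

```python
from __future__ import annotations

from typing import Iterable

# Hypothetical "matrix of fields": each source's own field names mapped
# onto one shared schema. The sources and names are illustrative only.
FIELD_MAP = {
    "census": {"region_id": "region", "pop": "population"},
    "traffic": {"Region": "region", "daily_vehicles": "vehicles_per_day"},
}


def normalize(source: str, rows: Iterable[dict]) -> list[dict]:
    """Rename fields to the shared schema and drop spurious records."""
    mapping = FIELD_MAP[source]
    cleaned = []
    for row in rows:
        # Keep only mapped fields, renamed to the shared schema.
        record = {mapping[k]: v for k, v in row.items() if k in mapping}
        # Assumed rule: a record is spurious if a mapped field is missing,
        # is None, or holds a negative number.
        if len(record) != len(mapping):
            continue
        if any(v is None for v in record.values()):
            continue
        if any(isinstance(v, (int, float)) and v < 0 for v in record.values()):
            continue
        # Standardize the join key so that " n1 " and "N1" match.
        record["region"] = str(record["region"]).strip().upper()
        cleaned.append(record)
    return cleaned


def combine(*normalized: list[dict]) -> dict[str, dict]:
    """Merge normalized records from several sources, keyed by region."""
    merged: dict[str, dict] = {}
    for records in normalized:
        for record in records:
            merged.setdefault(record["region"], {}).update(record)
    return merged
```

In practice the rules for what counts as spurious come from the business question being asked, which is why this capability calls for analytics skills as well as technical ones.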
The third capability, analyzing the data, is perhaps the team’s most important chore. For most organizations, the analytic process is conducted by the data scientist, who accesses the data, designs algorithms, gathers the results, and then presents the information.
These three primary chores define a data analytics team’s functions. However, there are several subsets of tasks that fall under each category, and these tasks can vary based on scope and other elements specific to the required data analytics process.
Much like the data themselves, the team should not be static in nature and should be able to evolve and adapt to the needs of the business.
Locating the right talent to analyze data is the biggest hurdle in building a team. Such talent is in high demand, and the need for data analysts and data scientists continues to grow at an almost exponential rate.
Finding this talent means that organizations will have to focus on data science and hire statistical modelers and text data–mining professionals as well as people who specialize in sentiment analysis. Success with Big Data analytics requires solid data models, statistical predictive models, and text analytic models, since these will be the core applications needed for Big Data analytics.
Locating the appropriate talent takes more than just a typical IT job placement; the skills required for a good return on investment are not simple and are not solely technology oriented. Some organizations may turn to consulting firms to meet the need for talent; however, many consulting firms also have trouble finding the experts that can make Big Data pay off.
Nevertheless, there is a silver lining to the Big Data storm cloud. Big Data is about business as much as it is about technology, which means that it requires a hybrid talent. This allows the pool of potential experts to be much deeper than just the IT professional workforce. In fact, a Big Data expert could be developed from other departments that are not IT centered but that do have a significant need for research, analysis, and interpretation of facts.
The potential talent pool may grow to include any staffers who have an inherent interest in the Big Data technology platforms in play, who have a tools background from web site development work earlier in their careers, or who are just naturally curious, talented, and self-taught in a quest to be better at their jobs. These are typically individuals who can understand the value of data and the principles behind interpreting the data.
But organizations should not hire just anyone who shows a spark of interest in or a basic understanding of data analytics. It is important to develop a litmus test of sorts to determine if an individual has the appropriate skills to succeed in what may be a new career. The candidates should possess a foundation of five critical skills to immediately bring value to a Big Data team:
These define what a data scientist should be able to accomplish.
Arguably, finding and hiring talented workers with analytics skills is the first step in establishing an advanced data analytics team. If that is indeed the case, then the second step would be determining how to structure the team in relation to existing IT and BI groups, as well as determining how much autonomy to give the analytics professionals.
That process may require building a new culture of technology professionals who also have significant business skills. Developing that culture depends on many factors, such as making sure that the teams are educated in the ways of the business culture in place and emphasizing measurements and results.
Starting at the top proves to be one of the best ways to transform an IT-centered culture into an internal business culture that thrives on advanced data analytics technology and fact-based decision making. Businesses that have experienced a change in senior management often clear the path for the development of a data analytics business culture and a data warehousing, BI, and advanced analytics program.
Instituting a change in cultural ideology is one of the most important chores associated with leveraging analytics. Many companies have become accustomed to running operations based on gut feelings and what has worked in the past, both of which lead to a formulaic way of conducting business.
Nowhere has this been more evident than in major retail chains, which usually pride themselves on consistency across locations. That cultural perspective can prove to be the antithesis of a dynamic, competitive business. Instituting a culture that uses the ideology of analytics can transform business operations. For example, the business can better serve markets by using data mining and predictive analytics tools to automatically set plans for placing inventory into individual retail locations. The key is putting the needed products in front of potential customers, such as by knowing that snow shovels will not sell in Florida and that suntan lotion sells poorly in Alaska.
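As a rough illustration of such an inventory plan, the sketch below splits a fixed stock of one product across locations in proportion to forecast demand. It is a simplified assumption, not a description of any retailer's actual system: the forecast figures would come from a predictive model that is not shown here, and the function name and store names are hypothetical.

```python
from __future__ import annotations


def allocate_inventory(total_units: int, forecast: dict[str, float]) -> dict[str, int]:
    """Split total_units across stores in proportion to forecast demand.

    Assumes forecast values are non-negative. Whole units are assigned by
    rounding down, and any leftover units go to the stores with the
    largest fractional remainders.
    """
    total_demand = sum(forecast.values())
    if total_demand <= 0:
        return {store: 0 for store in forecast}

    exact = {s: total_units * d / total_demand for s, d in forecast.items()}
    plan = {s: int(x) for s, x in exact.items()}

    leftover = total_units - sum(plan.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - plan[s], reverse=True)
    for store in by_remainder[:leftover]:
        plan[store] += 1
    return plan


# Hypothetical snow shovel forecast: no demand expected in Miami.
plan = allocate_inventory(100, {"Anchorage": 90, "Miami": 0, "Denver": 30})
```

With these made-up figures, the Miami store receives no snow shovels at all, which is the kind of location-specific decision that a uniform, consistency-first stocking policy would miss.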
Another potential way to foster an analytics business culture within an organization is to set up a dedicated data analytics group. An analytics group with its own director could develop an analytics strategy and project plan, promote the use of analytics within the company, train data analysts on analytics tools and concepts, and work with the IT, BI, and data warehousing teams on deployment projects.
Success has to be measured, and measuring a team’s contribution to the bottom line can be a difficult process. That is why it is important to build objectives, measurements, and milestones that demonstrate the benefits of a team focused on Big Data analytics. Developing performance measurements is an important part of designing a business plan. With Big Data, those metrics can be assigned to the specific goal in mind.
For example, if an organization is looking to bring efficiency to a warehouse, a performance metric may be the amount of empty shelf space and what that empty shelf space costs the company. Analytics can be used to identify product movement, sales predictions, and so forth to move product into that shelf space to better serve the needs of customers. The measurement is then a simple comparison of the percentage of space used before the analytics process and the percentage of space used after the analytics team has tackled the issue.
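The before-and-after comparison can be expressed in a few lines of Python. This is only a sketch under stated assumptions: shelf space is counted in arbitrary units, the cost of an empty unit is a flat hypothetical figure, and the function names are invented for illustration.

```python
def utilization_pct(used_units: float, total_units: float) -> float:
    """Percentage of shelf space in use."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return 100.0 * used_units / total_units


def empty_space_cost(used_units: float, total_units: float, cost_per_unit: float) -> float:
    """Cost of unused shelf space, assuming a flat cost per empty unit."""
    return (total_units - used_units) * cost_per_unit


def utilization_gain(before_used: float, after_used: float, total_units: float) -> float:
    """Percentage-point change in utilization after the analytics effort."""
    return utilization_pct(after_used, total_units) - utilization_pct(before_used, total_units)


# Hypothetical figures: 1,000 shelf units, 600 in use before and 850 after.
gain = utilization_gain(600, 850, 1000)
savings = empty_space_cost(600, 1000, 2.5) - empty_space_cost(850, 1000, 2.5)
```

Tracking both the percentage-point gain and the change in empty-space cost gives the team a metric that ties directly to the stated goal of warehouse efficiency.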