Big (data) insights

19 May 2014 John Larson, M.P.P

Big data, small data, structured data, unstructured data. It's all necessary to answer critical questions and set business strategy. But data is just a raw material. The differentiator is the power of the tools and the analyst expertise that, combined, create true competitive advantage.

Data is unquestionably the primary business intelligence tool of global business; it offers a method by which companies can analyze the market and their position within it to develop informed strategies that will help them compete profitably. Companies have always collected data, and executives have always used that data to help them make decisions.

But today managers are frequently overwhelmed with data. There is simply too much of it available to adequately process without advanced tools and methodologies. Organizations are struggling with how they can leverage this ever-increasing data flow to their advantage.

At the same time, the sheer volume of data available today enables corporate leaders to combine and analyze it in ways that produce new insights into markets, customers, and business strategies. Global corporations are pinning their hopes on the transformational opportunities that this big data can purportedly unlock.

Few would doubt that the collection and analysis of the vast quantities of data now available to companies is essential to effectively and competitively run many areas of business operations. However, producing value from such expansive and amorphous data sets can pose a serious challenge. Technology alone won't solve the problem. In fact, most failed data analysis efforts derive from one or more of these three strategic errors:

  • The wrong question is asked.
  • The wrong data is used.
  • The data is treated as part of a discrete project rather than as part of an ongoing process.

Executives must recognize and avoid these potential pitfalls if they hope to harness the full power of the data now available to them to help realize their business objectives.

What is 'big data' and why do companies need it?

The aforementioned strategic errors stem, at least in part, from a basic misunderstanding of the term 'big data' and what companies should do with it. The term came into vogue so quickly that many don't know what it encompasses. One familiar definition cites the 'four Vs': volume, variety, velocity, value. While descriptive, this alliteration doesn't address how big data should be integrated into the analysis process.

Indeed, there are valuable insights to be gained from a range of data, both big and small. The key is to use the right set of data, one that captures the complete picture and provides insights that were formerly not apparent. This includes "differentiated data": third-party or proprietary data, typically found outside one's organization, that fills the gaps in the understanding of a trend or market.

Of course, data analysis involves more than just data. It requires the appropriate industry expertise, analytical models, and enabling tools that must be integrated into a solution that surfaces specific actions and informs key decisions. Here, perhaps, is a more useful definition that captures the role of big data in the analysis process:

True big data analysis gives context to a complex set of information; applies sophisticated analytics that transform the information in ways that answer important questions on demand; and highlights new insights yielding critical information that informs big decisions and strategy.

This "complex set of information" typically relies on a large amount of data, but it can also include small bits of data, discrete information sets, high-value differentiated data, and industry knowledge drawn from a number of sources. Companies are no longer limited to just curating information in their structured databases. Data can come from many places, such as videos or social media snippets. It can be structured, like the information found in traditional databases, or unstructured or semi-structured, like Twitter data.

As we increasingly use digital devices to conduct business, access information, and interact socially, we ourselves are becoming data agents, generating previously unfathomable volumes of information regarding our activities. And the emergence of connected devices, often referred to as the Internet of Things, will add even more usable data in the years to come. Indeed, there is a full spectrum of data, from small to large, both structured and unstructured, each type of data playing a critical role in helping answer the questions we dare to ask.

Data collection is just the first step in the process of utilizing data to help inform business strategy. Effectively analyzing that data involves guidance and knowledge from the right people (industry experts and analytical specialists) armed with the appropriate tools (platforms and analytical models) that can 'connect the dots' between seemingly unrelated phenomena. In our highly interconnected world, competitive advantage comes not only from the speed with which data can be analyzed, but also from how effectively the barriers between different types of information can be broken down to help establish the big picture and provide big insights. (See figure below.)

Big data produces new information that unleashes the power of modeling. The ability to develop models based on all the data (the entire population rather than just a sample) empowers analysts and greatly enhances their predictive ability. With a sample, analysts must make difficult assumptions that may or may not be correct. Big data offers insights into what is truly occurring because it is drawing from a complete set of actual data. That data tends to be contemporaneous: it is generated and analyzed in near real time and so reflects the state of the world now rather than several weeks or months ago. Importantly, this analysis leads to entirely new information that businesses can use to make faster, better decisions that lead to competitive advantage.

Best practices for big data

Big data analysis requires adherence to a disciplined approach to ensure the process results in clear and actionable insights. This approach can be captured with three core best practices:

1. Ask the right question, clearly

This practice may seem obvious. Nevertheless, failure can often be traced back to a bad or unclear question. The data may be correct and the analysis flawless but, if the issue under analysis is poorly defined and the query off-base, the answer suggested by the analysis may not be what the company needs to know or act on. To ask the right question, managers are well advised to employ experts with a deep understanding of the company's industry and markets, and often related industries as well. Also needed are analysts who understand the data and analytical tools in the context of specific industries and markets and who ultimately can translate data into clear insights for industry executives.

While on the surface it might seem that a simple question will result in a simple answer, the process is often more complex. For example, the Panama Canal Authority (ACP) started out wanting to know how the expansion of the canal would impact its revenues. Ultimately, a series of interdependent questions emerged that broadened the analysis. If the ACP had simply built a model to forecast revenue based on historical trends and relationships, it would have missed important nuances about dynamic shifts in global trade, some of which could significantly impact the canal's competitive position in the market. (See sidebar "Ask the right questions" below.)

2. Look beyond your own horizon

We live in an increasingly interconnected world. Markets, technologies, and industries across the globe are converging. Aerospace suppliers now compete for critical parts and equipment with suppliers to the automotive and maritime industries. Consumer preferences in the mobile media market are now shaping the development of technologies in the automotive sector. The greatest power of big data comes from its ability to integrate information from a multitude of sources, allowing organizations to see the big picture and form insights never before discernible.

But to tap this power, companies need to look beyond themselves and their immediate market. That means identifying and using data from sources outside the company and perhaps outside the industry. In a dynamic, global economy, businesses cannot rely on extrapolations of the past to predict the future. They must also rely on industry experts who understand emerging trends and can help adjust a mathematical model to ensure greater forecast accuracy.

The ACP brought together more than 30 experts from across many industries and disciplines, including trade and transportation, maritime, energy, chemical, automotive, and economics, to construct detailed models of trade flows by type of goods and commodities, type of vessel, and origin and destination of shipment. The models combined the canal's own data with data from other industries in ways that had not been done before. For example, these models analyzed data on the boom in US natural gas production and uncovered new shipping routes and corresponding shipping requirements that historical trade data could not have projected. Among other things, they demonstrated that the development of capacity for 6.5 billion cubic feet/day of liquefied natural gas exports from the Gulf of Mexico to Asia would likely create transits through the Panama Canal that the ACP had not anticipated. (See sidebar "Gather the right data-big and small" below.)

3. Big data analysis is a voyage, not a destination

In an increasingly volatile world, data changes quickly. Accordingly, data analytics needs to be an ongoing, iterative process. Economic and political environments change rapidly. Commodity prices rise and fall. Models need to be updated not only with the latest data but also the latest expert insights. Technology improvements may not only change a company's markets, requiring the updating of data, but may also impact the ability to analyze larger amounts of data. Significant ongoing investments in infrastructure may be required to guard the return on investment. Processes for regularly collecting and using the latest data should be in place.

Perhaps most importantly, data analysis should be a process of continual improvement and fine-tuning. It is imperative to learn from the past as well as to try to project into the future. Failures often offer critical insights that help make future efforts more successful. Comparing what the model predicted would happen with what actually happened provides the ability to adjust models as required. Continual refreshing of the data, models, and industry insight is critical to producing the best, most accurate projections which, in turn, provide the basis to make the best decisions.
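The comparison of predicted and actual outcomes can be made concrete with a small calculation. The sketch below, in Python, scores a forecast using mean absolute percentage error and flags the model for review when the error exceeds a tolerance. The function names, the 10% tolerance, and the sample figures are illustrative assumptions only; they are not drawn from any model described in this article.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed as a percentage.

    Assumes every actual value is non-zero.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def needs_recalibration(actual, predicted, tolerance_pct=10.0):
    """True when the forecast error exceeds the (hypothetical) tolerance."""
    return mean_absolute_percentage_error(actual, predicted) > tolerance_pct


# Made-up placeholder values, for illustration only.
actual_values = [100.0, 110.0, 120.0]
predicted_values = [90.0, 110.0, 132.0]

mape = mean_absolute_percentage_error(actual_values, predicted_values)
print(f"Forecast error: {mape:.1f}%")
print("Recalibrate:", needs_recalibration(actual_values, predicted_values))
```

Run on each new period of data, a check like this turns "learn from the past" into a routine step: when the error drifts above the tolerance, the data, the model, or the expert assumptions behind it are due for a refresh.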

Applying these three best practices will increase the odds that data analysis yields high-quality results, and that data analysts avoid making strategic mistakes by answering the wrong questions, getting waylaid by forces outside their own company or market, or hitting an unforeseen roadblock because of a failure to update information and fine-tune their analysis.

Big data is important but it is no panacea. It's just one type of data, albeit an important one, that is required to help companies understand the world and make informed decisions. Equally important are the years of accumulated wisdom from industry experts who ask the right questions, build the models, analyze the data, and interpret the answers to deliver the big insights.


The Panama Canal Authority's (ACP's) use of big data illustrates how applying best practices can ensure better results.

The authority originally wanted to project how the upcoming completion of the canal's expansion would change its market opportunity. The ACP already collects data on the nearly 11,000 ships that pass through the canal each year. It knows how changes in the number of ships are likely to impact revenue based on this historical data. However, a US$5 billion-plus expansion expected to be completed in 2015 will widen the canal significantly, allowing much larger ships to pass through. Historically, it could handle ships 106ft wide and 965ft long (32m x 294m). After the expansion, it will be able to handle Supermax ships measuring 160ft x 1,200ft (49m x 366m). The number of containers each ship can carry is expected to rise roughly 2.6-fold, from 4,800 to 12,500.
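As a quick check on the scale of the change, the Python snippet below converts the ship dimensions quoted above to metres and computes the capacity ratio implied by the 4,800 and 12,500 container figures. The only inputs are the numbers in the text and the standard definition of the foot (0.3048m); the variable and function names are illustrative.

```python
FEET_TO_METERS = 0.3048  # exact, by international definition of the foot


def feet_to_meters(feet):
    """Convert a length in feet to metres."""
    return feet * FEET_TO_METERS


# Dimensions quoted in the text, in feet.
old_beam_m = feet_to_meters(106)
old_length_m = feet_to_meters(965)
new_beam_m = feet_to_meters(160)
new_length_m = feet_to_meters(1200)

# Container capacity per ship, before and after the expansion.
capacity_ratio = 12500 / 4800

print(f"Before: {old_beam_m:.0f}m x {old_length_m:.0f}m")
print(f"After:  {new_beam_m:.0f}m x {new_length_m:.0f}m")
print(f"Capacity ratio: {capacity_ratio:.1f}x")
```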

Analysis of this data is vital to the canal's operations and revenue generation potential. The Panama Canal is a key trade link between the Atlantic and Pacific and competes with numerous alternative water and overland routes, including the Suez Canal, transit options around Cape Horn, and other water-overland combinations. Shippers assess a variety of factors (such as bunker fuel costs, canal transit fees, port fees, and locations) when determining routing options, so it is imperative that the ACP has the right data and tools to maintain and expand its market share.

The ACP could have built a model taking the increase in ship size into account and melded it with historical trends. In fact, that was the initial approach. And the original question the ACP wanted to answer was: How will the canal's expansion impact its available market?

However, after expert analysis and discussion, the ACP realized that the picture was more complex and a much broader set of data needed to be brought into the analysis. It needed to connect the dots between developments around the world and take into account a variety of changes in global trade for goods and commodities. Quite simply, it needed to ask a new set of questions.

IHS helped the ACP build a meta-model and an interactive, inter-related framework. Experts with knowledge of trade and transportation, maritime, energy, chemicals, automotive, and economics were brought together. The model incorporated factors from various markets into the analysis and asked three questions instead of one:

How is the world's fleet evolving? This required information on both fleet size and vessel mix. The data gathered included how many ships there are by type, size, and operating characteristics, and what goods they can carry. Shifts in fleet size and vessel mix are a function of numerous factors, including berthing options, the widening of the Panama Canal, how many ships are likely to be retired, environmental emission requirements, and how many new ships are expected to come on to the market.

How will shifts in supply and demand for commodities impact shipping? This required information on commodity flows and how shipping prices may change. Many factors influence commodity flows, such as relative economic performance; how the oil and gas boom in the United States impacts shipping patterns for these commodities; the growing trend to co-locate production facilities close to customers; and how likely it is that emerging industrial and manufacturing locations, such as Mexico, will become major players in the global trade markets.

How will carriers optimize shipping methods and routes to minimize costs? This required in-depth understanding of voyage costs, including data on current and future transportation costs for trade lanes by origin and destination for ports throughout the world; the cost to operate each ship (fuel, crew, time, value of money, number of transit cycles) and the cost of other transit options, such as rail or pipeline, and other routing options such as Cape Horn.

The model that was developed to answer these questions can generate transits by vessel size and global trade lanes. It can also compare alternative shipping routes (Suez Canal, Cape Horn, and Cape of Good Hope) and estimate how they stack up against the Panama Canal in terms of cost and risk.
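To give a sense of what a route comparison involves at its simplest, the Python sketch below totals a voyage cost from sailing time, daily ship costs, and a transit fee, then picks the cheaper of two routes. It is a deliberately reduced illustration, not the model IHS built for the ACP: the `Route` class, the function names, and every distance, speed, and cost figure are hypothetical placeholders, and risk is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    distance_nm: float   # voyage length in nautical miles (placeholder value)
    transit_fee: float   # canal toll in dollars; 0 where no toll applies


def voyage_cost(route, speed_knots, daily_operating_cost, daily_fuel_cost):
    """Sailing days times daily ship costs, plus any transit fee."""
    days_at_sea = route.distance_nm / (speed_knots * 24.0)
    return days_at_sea * (daily_operating_cost + daily_fuel_cost) + route.transit_fee


def cheapest_route(routes, **ship):
    """Return the route with the lowest total voyage cost."""
    return min(routes, key=lambda r: voyage_cost(r, **ship))


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
ship = {
    "speed_knots": 20.0,
    "daily_operating_cost": 30_000.0,
    "daily_fuel_cost": 20_000.0,
}
routes = [
    Route("canal route", 9_600.0, 300_000.0),
    Route("cape route", 14_400.0, 0.0),
]

for r in routes:
    print(f"{r.name}: ${voyage_cost(r, **ship):,.0f}")
print("Cheapest:", cheapest_route(routes, **ship).name)
```

With these placeholder numbers the shorter canal route wins, but raising the toll far enough tips the balance toward the longer route. That sensitivity is the kind of trade-off a shipper weighs when choosing between routes.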

The ACP is using the model for market planning and budget projections, including setting rates for ships passing through the canal. It now has insight, for example, into how pricing might impact customers who are considering alternative trade routes or transit methods, such as rail or pipeline.


When a company sets out to solve a business problem it must draw on the expertise and experience of many stakeholders and evaluate all possible sources of data. The first step in the process is critical.

Working with the Panama Canal Authority (ACP), IHS convened a workshop of ACP representatives from each stakeholder group to discuss the challenge of accurately forecasting traffic through the canal. The goals were very specific:

1 - Identify the data needed for accurate forecasting.
2 - Define the analytics and models needed to extract insight from the data.
3 - Develop the tools and visualizations that communicate these insights.

A workshop leader moderated the discussion, asking stakeholders for information and insight, and drafting outlines and visualizations for participants to critique. The discussion allowed stakeholders to hear others' perspectives and build consensus. The result was a significant expansion of the type of data that was needed as well as changes in tool design, structure, calculations, and final visualization of the solution. (See figure below.)

By having the group work together, new details of trade flows and canal operations came to light, resulting in a more robust model and more accurate forecasts.

CASE STUDY PANAMA CANAL AUTHORITY: Gather the right data-big and small

The models developed for the Panama Canal to forecast future trade routes require a wide range of data. They also require looking beyond the historical relationships between GDP and commodity tonnage shipped, the traditional key drivers of trade.

The map below offers an example of how important the inclusion of new data can be to "connect the dots" to reveal a more accurate story of traffic flow through the canal. In this case, the data has captured the unconventional oil and gas revolution in the United States. The map shows the annual shipments in metric tons of refined petroleum products passing through the Panama Canal from the Gulf of Mexico to Asia and the west coast ports of South America. The addition of the unconventional oil and gas data reveals a significant shift in market share forecasts that was not evident within the GDP and commodity data. (See figure below.)

The 2009 data shows the historical values of trade without factoring in the production of unconventional oil and gas in the United States.

By 2013, as unconventional oil and gas data is incorporated into the analytical model, the trade tonnage for refined petroleum products has shifted dramatically, increasing by 20% to Asia and 65% to South America. This rapid shift was driven largely by exports of diesel fuels, a direct result of the unconventional revolution in the US.

IHS forecasts that by 2017 the annual tonnage of refined petroleum products traversing the Panama Canal will be more than 140% higher than in 2009. The forecast draws on data for well counts, production volumes, refinery capacity, and build-out plans to predict how production and, therefore, exports of refined petroleum products will impact tonnage traversing the canal.

The unprecedented growth of shipments to each destination illustrates the power of combining data from many sources with insight from industry experts on market trends to produce a model that provides insight beyond that which projections of historical trends alone can produce.

John Larson Vice President of Big Data Analytics, IHS