As the economic shock from the Covid-19 crisis started to play
out, and businesses tried to understand the trajectory of economies
and identify signs of recovery in specific countries and sectors,
the need for novel datasets grew. Waiting a quarter for official
data to be released was months too long, and lower-latency
approaches were needed. The global event data in our Data Lake
was a natural candidate for research into novel signals. This
database tracks and structures reports from news and social media
sources about politics and economics in eight languages and
enhances the information with country classifications and machine
translations.
Using this data, we wanted to identify notable events relevant
to economic crisis and recovery and contextualize the developments
that were driving the hard economic data. To derive signals from
this large amount of textual information, we applied techniques
from natural language processing (NLP) such as sentiment analysis
and topic modelling. This allowed us to structure our analysis of
news reports published over the summer and to track the economic
recovery from Covid-19 in the US.
Sentiment classification
We first filtered down to news reports that include keywords
that are associated with economic crisis or economic recovery. To
distinguish between negative and positive reports at scale, we
assigned a sentiment score to each news article
using a lexical method. As opposed to data-intensive
machine-learning methods, this approach does not require previously
labelled examples (training data). It is instead based on
pre-defined lists of words (dictionaries), where a sentiment score
gets assigned to each word. By combining the scores for all words,
we can generate a sentiment score between -1 (negative) and +1
(positive) for each report, which allows us to systematically
classify news reports that provide encouraging signals for the
economy and those that suggest a gloomier state. The following
table provides examples of this method.
We then
combined the signals from all news reports to create a sentiment
index. This index quantifies the economic sentiment of all news
reports that were published on a given day and assigned to the
country of interest (in the example below, the United States). Each
daily score represents the aggregate of negative and positive news
stories published on that day.
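The steps described above (keyword filter, dictionary-based scoring,
daily aggregation) can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not the pipeline used
for the index: the keyword list, the dictionary entries and their
scores, the example reports, and the choice to average scores (both
within a report and across a day) are all hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical keyword list and mini-dictionary, for illustration only.
KEYWORDS = {"economy", "economic", "recession", "recovery", "unemployment"}
LEXICON = {
    "recovery": 1.0, "growth": 0.8, "rebound": 0.7,
    "recession": -1.0, "layoffs": -0.9, "slump": -0.7,
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def is_relevant(text):
    """Keep only reports that mention at least one economic keyword."""
    return any(word in KEYWORDS for word in tokenize(text))

def score_report(text):
    """Average the dictionary scores of matched words; 0.0 if none match.

    Averaging is an assumption. Because every word score lies in
    [-1, 1], the report score does too.
    """
    hits = [LEXICON[w] for w in tokenize(text) if w in LEXICON]
    return mean(hits) if hits else 0.0

def daily_index(reports):
    """Mean report score per (country, date), over relevant reports only."""
    by_day = defaultdict(list)
    for r in reports:
        if is_relevant(r["text"]):
            by_day[(r["country"], r["date"])].append(score_report(r["text"]))
    return {key: mean(scores) for key, scores in sorted(by_day.items())}

# Invented example reports, not taken from the underlying data.
reports = [
    {"country": "US", "date": "2020-06-01",
     "text": "Economy shows growth as orders rebound"},
    {"country": "US", "date": "2020-06-01",
     "text": "Recession fears grow after layoffs"},
    {"country": "US", "date": "2020-06-01",
     "text": "Local team wins cup final"},  # dropped by the keyword filter
    {"country": "US", "date": "2020-06-02",
     "text": "Economic recovery gathers pace"},
]

index = daily_index(reports)
for (country, date), value in index.items():
    print(country, date, round(value, 3))
```

In this toy example the first day mixes one positive and one negative
report, so its index value sits close to zero, while the second day
contains a single positive report. A production dictionary would be
far larger and would typically handle negation and word forms, which
this sketch ignores.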
Sentiment in US economic news
The index below shows the daily movements of sentiment in news
reports about the US economy. It can be used as a quantified input
into a model to nowcast economic activity and as a tool to identify
relevant events that are driving news sentiment. Reports on the
publication of economic statistics, for example GDP growth (June
30th), orders for big-ticket goods (July 27th) or the unemployment
rate (August 7th), affect the index as much as political
statements, for example by the President (July 1st) or Fed
officials (July 14th). Other relevant events include policy
decisions and business news. Surfacing these events can help to
alert analysts such as buy- and sell-side researchers to important
market developments. The index can be used as a complement to
lower-frequency consumer or business surveys (such as the PMI by
IHS Markit) and official data to provide more timely information
and to understand the underlying drivers of change.
To better understand the underlying themes driving positive and
negative news articles, we analyzed the topics covered in the
articles. We identified these topics using a statistical model that
assumes that the semantics of news reports are governed by
unobserved "latent" variables, the topics. These topics shape the
meaning of the news stories. Given that a news
report usually focuses on a particular topic, one would expect
certain words to appear together. We can thus identify topics based
on the co-occurrence of words in the news reports.
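To make the co-occurrence idea concrete, the sketch below counts how
often pairs of words appear in the same report. It is deliberately
not a topic model: it only illustrates the raw co-occurrence signal
that latent-variable topic models build on. The stopword list, the
function name and the example headlines are invented for
illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; real pipelines use much longer ones.
STOPWORDS = {"the", "a", "of", "as", "and", "to", "in", "on", "for"}

def cooccurrence(docs):
    """Count in how many documents each unordered word pair co-occurs."""
    pairs = Counter()
    for doc in docs:
        # Unique, sorted words so each pair is counted once per document
        # and always appears in the same (alphabetical) order.
        vocab = sorted(set(re.findall(r"[a-z]+", doc.lower())) - STOPWORDS)
        pairs.update(combinations(vocab, 2))
    return pairs

# Invented example headlines, not taken from the underlying data.
docs = [
    "Oil prices fall as demand drops",
    "Weak demand hits oil producers",
    "Unemployment claims rise as layoffs spread",
    "Layoffs push unemployment higher",
]

pairs = cooccurrence(docs)
for pair, count in pairs.most_common(2):
    print(pair, count)
```

Here the pairs ("demand", "oil") and ("layoffs", "unemployment") each
occur in two headlines, while no pair links the energy vocabulary to
the labor vocabulary. A topic model generalizes this intuition by
estimating, for each latent topic, a probability distribution over
words.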
The two main negative topics identified for the US in June, July
and August deal with energy and labor markets. There are plenty of
news reports about falling oil prices due to a massive drop in
demand and its damaging effects on oil companies such as Exxon
Mobil and BP. This company-specific information can help traders to
systematically track market sentiment about specific businesses or
industries. In addition, many reports discussed rising
unemployment during the crisis. These reports are generally
associated with negative sentiment. While the official unemployment
rate is reported with a time lag, reports about large layoffs or
assessments by business leaders can indicate changes in labor
market conditions in a more timely manner.
On the positive side, news covering the emergency loan program
that supports small businesses in the US during the crisis reflects
optimistic reporting. These articles are associated with positive
sentiment scores.
Operational deployment
Analyzing relevant topics and the associated sentiment in tandem
provides us with a structured way to digest high-frequency textual
information. While the sentiment index might be used as a
quantitative input into a model, in practice it should always be
accompanied by an analysis of the topics and events that are
driving it. These topics and events encompass important economic
signals during an economic crisis such as policy statements,
parliamentary decisions or the publication of official statistics
that might otherwise be missed.
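One simple way to read topics and sentiment in tandem is to group
article-level sentiment scores by topic and compare the averages. The
sketch below assumes each article already carries a topic label and a
sentiment score. The records, scores and function name are invented
for illustration and do not reproduce results from the analysis.

```python
from collections import defaultdict
from statistics import mean

# Invented records: topic labels and sentiment scores are hypothetical.
articles = [
    {"topic": "energy", "sentiment": -0.6},
    {"topic": "energy", "sentiment": -0.4},
    {"topic": "labor market", "sentiment": -0.5},
    {"topic": "small business loans", "sentiment": 0.5},
]

def sentiment_by_topic(articles):
    """Return {topic: (number of articles, mean sentiment)}."""
    grouped = defaultdict(list)
    for article in articles:
        grouped[article["topic"]].append(article["sentiment"])
    return {topic: (len(scores), mean(scores))
            for topic, scores in grouped.items()}

summary = sentiment_by_topic(articles)
# List topics from most negative to most positive average sentiment.
for topic, (count, avg) in sorted(summary.items(), key=lambda kv: kv[1][1]):
    print(f"{topic}: {count} article(s), mean sentiment {avg:+.2f}")
```

Reporting the article count next to the mean helps show whether a
move in the index comes from many reports on one theme or from a few
strongly worded ones.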
Over the summer of 2020, analysts were confronted with hundreds of
news reports about the Covid-19 crisis and its economic implications
every day. Sentiment analysis and topic modelling helped to
distinguish between positive and negative reporting and to surface
relevant events. By following news clusters about oil prices and
energy markets over time, analysts could track patterns and trends
in this market and identify how specific firms were affected. These
tools thus help analysts to understand what is happening in the
economy, evaluate change, and make more informed decisions.
Given that the event data in our Data Lake cover a wide range of
topics, these methods can also be applied to monitor security risks,
operational disruptions or political developments such as election
campaigns.
Posted 16 September 2020 by Dr. Marie Lechler, Data Analytics Principal, Applied Intelligence, IHS Markit and
Mateusz Mynarski, Economics & Country Risk Data Analyst, IHS Markit