- Data for: Eco-efficiency assessment of the electricity sector: evidence from 28 European Union countries (Dataset)
- Data for: Tax Cuts and “Middle-Class” Workers (Dataset). Data related to the paper.
- Data for: Performance Impacts of Structure and Volition in Implementing Policy through IT-Enabled Government-Citizen and Government-Employee Interactions (Dataset). Crisp-set QCA data of US states in 2004 and 2005 for internal and external IT-enabled interaction types.
- Data for: A note on determinants of Japanese Foreign Direct Investment in Southeast Asia, 2008–2015 (Dataset). The FDI-related data are obtained from Japan’s Ministry of Economy, Trade and Industry; the nominal exchange rate, consumer price index, and real GDP data from the International Monetary Fund. The share of Japanese value added by industry is obtained from the Cabinet Office, Government of Japan, and the GDP per capita data from the World Bank. In addition, Thai industry-level GDP growth and industry-level FDI inflows from abroad (per GDP) are obtained from the Bank of Thailand.
- Data for: Oil Price Dynamics and Sectoral Employment in the U.S. (Dataset). Time-series data at the county level for oil prices and employment.
- Data for: Effects of migration and business networks on intellectual property trade: Evidence from Japan (Dataset). Data on intellectual property trade, migration, and foreign direct investment.
- Data for: Distributional effects of environmental taxation: an approximation with a meta-regression analysis (Dataset). The first sheet is a summary table of the studies. The second sheet contains the same information as the first, but prepared for coding. The third sheet is the dataset used for the regressions. The fourth sheet details the codes used for the indicator variables.
Replication data for "Economic uncertainty and natural language processing; The case of Russia"
The paper proposes a method for constructing text-based country-specific measures of economic policy uncertainty. To avoid translation problems and the cost of human validation, we apply natural language processing and sentiment analysis to construct such measures for Russia. We compare our measure with one developed earlier using direct translations from English and human validation. In this comparison, our measure does equally well at evaluating the uncertainty related to key events that affected Russia between 1994 and 2018, and performs better at detecting the effects of uncertainty on Russia’s industrial production.
Data used to construct uncertainty indexes
We constructed the EPU index using data from four daily newspapers available electronically:
1. Kommersant (Oct 1992 – Feb 2018): 579,997 articles
2. Moskovskiy Komsomolets (Jan 2005 – Feb 2018): 143,758 articles
3. Novaya Gazeta (Feb 2004 – Feb 2018): 63,884 articles
4. Vedomosti (Dec 2003 – Feb 2018): 342,309 articles
These newspapers cover a good spectrum of readerships. Kommersant is a daily of broad circulation, primarily but loosely associated with business and commercial news for a wide group of readers. According to https://www.kommersant.ru/about/kommersant (accessed 23 January 2020), its daily circulation is around 100,000–110,000 copies. Moskovskiy Komsomolets is a popular newspaper aimed at a general audience, with a print circulation of around 700,000 copies according to https://ria.ru/20091211/198562973.html. Vedomosti is a business daily aimed at students and professionals, with quite limited circulation. According to the Russian Wikipedia page https://ru.wikipedia.org/wiki/ведомости, its daily circulation is 75,000 copies. Novaya Gazeta is regarded as relatively independent and sometimes critical of the Russian government. It is not a daily in the strict sense: as of 2019 it has been published three times a week. Its reported circulation in August 2009 was 104,700 (https://web.archive.org/web/20090822153334/http://www.pressaudit.ru/registry).
There are four CSV files, one per newspaper, named *-sent2.csv, with the following columns:
- date
- number of words in the economy category
- number of words in the policy category
- number of words in the uncertainty category
- document id
- number of the LDA topic (15 latent topics)
- name of the LDA topic (15 latent topics)
- number of the LDA topic (30 topics; 20 for Kommersant)
- name of the LDA topic (30 topics; 20 for Kommersant)
- *20/50: number of the article's words found in the word2vec dictionary for the uncertainty, policy and economy categories, using the 20 or 50 words with the smallest cosine distance
- pos/neg/sent: percentage of words with positive/negative inclination, with sent = pos - neg
- 1 for standard sentiment lexicons, 2 for Covid-augmented lexicons
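The per-article layout above can be illustrated with a small sketch. The column names used here (econ_words, pos, neg, etc.) are placeholders, since the actual CSV headers are not documented in this README; the filter shown applies the standard EPU criterion of requiring words from all three categories.

```python
import csv
import io

# Hypothetical two-row sample mimicking the assumed layout of a *-sent2.csv
# file; the real column names and order may differ.
sample = """date,econ_words,policy_words,unc_words,doc_id,pos,neg,sent
2010-01-05,12,7,3,101,4.2,1.1,3.1
2010-01-06,0,2,0,102,2.0,2.5,-0.5
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# An article counts toward the EPU frequency only if it contains words from
# all three categories: economy, policy and uncertainty.
epu_hits = [r for r in rows
            if int(r["econ_words"]) > 0
            and int(r["policy_words"]) > 0
            and int(r["unc_words"]) > 0]

# Sanity check: sent should equal pos - neg, as stated in the field list.
for r in rows:
    assert abs(float(r["sent"]) - (float(r["pos"]) - float(r["neg"]))) < 1e-9

print(len(epu_hits))  # -> 1
```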
Uncertainty indexes and macroeconomic data
Data description
File U_data: data for different uncertainty indices
Symbols are as in the Appendix of the paper (uncertainty index / column symbol):
- U: computed for all newspapers
- U(Kom.): computed for Kommersant only
- U(Hom.): computed under homogeneity of journalistic style
- U(Het.): computed under heterogeneity of journalistic style
- U(Louk.): computed with the Loukashevich lexicons
- U(Kag.): computed with the Kaggle lexicons
- U-: weighted by negative sentiments only
Other files with micro data are stored in files named by the following convention:
RU_s_LDA_VINTAGE_LEXICON
where the integers s, LDA, VINTAGE and LEXICON describe the different ways of computing sentiments, the topic modelling, the vintage of the data, and the sentiment lexicons applied. The files contain monthly data, mainly the frequencies of appearance of the articles selected by different methods and weighted by different sentiment indicators.
In detail:
Excel_data_recomp_s, where s=0,…,6:
s=0: indices are weighted by crude sentiment frequencies.
s=1: indices are weighted by 1+- crude sentiment frequencies.
s=2: as for s=1, but the sentiments are rescaled.
s=3: as for s=1, but the sentiments are values of an exponential distribution.
s=4: valence is used as the measure of sentiments;
see Ferrara E, Yang Z (2015) ‘Measuring Emotional Contagion in Social Media’, PLoS ONE 10, e0142390, doi:10.1371/journal.pone.0142390.
Valence is computed from the sentiment ratios, that is, as if s=0. It is, of course, possible to combine valence with switch_sent 1, 2 and 3.
For s=5 and s=6, weights are classified according to the SentiStrength methodology, where the classes are set according to the quantiles of the frequency of sentiments. Four quantile points are used to divide the sentiments into classes: 0.15, 0.5, 0.75 and 0.9.
s=5: classes are set according to quantiles computed over all journals (assumption of homogeneity of readers' perception).
s=6: quantiles are computed separately for each journal and lexicon (assumption of heterogeneity of readers' perception).
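The quantile-based classification used for s=5 and s=6 can be sketched as follows. This is only an illustration under assumptions: the README does not specify the quantile estimator, the class labels, or how ties are handled, so a simple nearest-rank estimator and integer class labels are used here.

```python
def sentiment_classes(values, quantile_points=(0.15, 0.5, 0.75, 0.9)):
    """Assign each sentiment frequency to a class (0, 1, 2, ...) defined by
    the given quantile cut points, in the spirit of the SentiStrength-style
    weighting described for s=5 and s=6.  Sketch only."""
    vals = sorted(values)
    n = len(vals)
    # Nearest-rank empirical quantiles; the authors' estimator may differ.
    cuts = [vals[min(int(q * n), n - 1)] for q in quantile_points]
    # Class = number of cut points strictly exceeded by the value.
    return [sum(v > c for c in cuts) for v in values]

# For s=5 the pool of values would span all journals; for s=6 it would be
# built separately per journal and lexicon.
print(sentiment_classes([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
```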
In each directory, there are 10 files with data. The convention of naming the files is the following:
RU_LDA_VINTAGE_LEXICON
where
if LDA=0: U is computed using data from all articles in the newspaper.
if LDA=1: U is computed using data from ‘relevant’ articles, where ‘relevance’ is decided by the 15-topic LDA.
if LDA=2: U is computed using data from ‘relevant’ articles, where ‘relevance’ is decided by the 30-topic LDA.
if VINTAGE=0: a reduced number of words in the descriptors is used.
if VINTAGE=1: an extended number of words in the descriptors is used.
if LEXICON=0: the Loukashevich sentiment lexicon is used.
if LEXICON=1: the Kaggle lexicon is used.
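A minimal decoder for the RU_LDA_VINTAGE_LEXICON naming convention could look like the sketch below. The field meanings come from the list above; the assumption that the fields are underscore-separated single digits (e.g. a file named RU_2_1_0) is hypothetical.

```python
# Descriptions for each field value, taken from the README.
LDA_DESC = {
    "0": "all articles",
    "1": "relevant articles (15-topic LDA)",
    "2": "relevant articles (30-topic LDA)",
}
VINTAGE_DESC = {
    "0": "reduced descriptor word list",
    "1": "extended descriptor word list",
}
LEXICON_DESC = {
    "0": "Loukashevich lexicon",
    "1": "Kaggle lexicon",
}

def describe(filename):
    """Decode a filename like 'RU_2_1_0' into human-readable settings."""
    _, lda, vintage, lexicon = filename.split("_")
    return (LDA_DESC[lda], VINTAGE_DESC[vintage], LEXICON_DESC[lexicon])

print(describe("RU_2_1_0"))
```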
In each file, there are 18 sheets containing the following:
Sheet 1: U: monthly EPU frequencies computed using articles with non-stemmed descriptors.
Sheet 2: U1: monthly EPU frequencies computed using articles with stemmed descriptors.
Sheet 3: U20: monthly EPU frequencies computed using Word2vec 20-word descriptors.
Sheet 4: U50: monthly EPU frequencies computed using Word2vec 50-word descriptors.
Sheet 5: U+: monthly EPU frequencies weighted by positive sentiments using all articles with non-stemmed descriptors.
Sheet 6: U1+: monthly EPU frequencies weighted by positive sentiments using all articles with stemmed descriptors.
Sheet 7: U20+: monthly EPU frequencies weighted by positive sentiments using Word2vec 20-word descriptors.
Sheet 8: U50+: monthly EPU frequencies weighted by positive sentiments using Word2vec 50-word descriptors.
Sheet 9: U-: monthly EPU frequencies weighted by negative sentiments using all articles with non-stemmed descriptors.
Sheet 10: U1-: monthly EPU frequencies weighted by negative sentiments using all articles with stemmed descriptors.
Sheet 11: U20-: monthly EPU frequencies weighted by negative sentiments using Word2vec 20-word descriptors.
Sheet 12: U50-: monthly EPU frequencies weighted by negative sentiments using Word2vec 50-word descriptors.
Sheet 13: U+-: monthly EPU frequencies weighted by the balance of sentiments using all articles with non-stemmed descriptors.
Sheet 14: U1+-: monthly EPU frequencies weighted by the balance of sentiments using stemmed descriptors.
Sheet 15: U20+-: monthly EPU frequencies weighted by the balance of sentiments using Word2vec 20-word descriptors.
Sheet 16: U50+-: monthly EPU frequencies weighted by the balance of sentiments using Word2vec 50-word descriptors.
Sheet 17: Sentiment scores: positive, negative and balanced (difference between positive and negative scores) for each article.
Sheet 18: Total number of articles considered for each month.
Except for sheets 17 and 18, columns C to F contain monthly frequencies for the newspapers Kommersant, Vedomosti, Moskovskiy Komsomolets and Novaya Gazeta, respectively. Columns containing only zeros should be ignored.
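The column layout just described can be sketched as follows, with a sheet represented as a plain list of rows rather than an actual Excel read; the contents of columns A and B in this example are hypothetical.

```python
# Columns C-F (0-based indices 2-5) hold monthly frequencies for the four
# newspapers, per the README; all-zero columns are placeholders to ignore.
PAPERS = ["Kommersant", "Vedomosti", "Moskovskiy Komsomolets", "Novaya Gazeta"]

def newspaper_series(sheet_rows):
    """Extract per-newspaper monthly series from one sheet's rows,
    dropping columns that contain only zeros."""
    series = {}
    for j, paper in enumerate(PAPERS, start=2):  # column C is index 2
        col = [row[j] for row in sheet_rows]
        if any(v != 0 for v in col):             # skip all-zero columns
            series[paper] = col
    return series

# Hypothetical three-month excerpt; columns A and B are assumed to hold the
# date and an index label, which this sketch does not use.
rows = [
    ["2005-01", "U", 12, 0, 5, 3],
    ["2005-02", "U", 9,  0, 4, 2],
    ["2005-03", "U", 15, 0, 6, 4],
]
print(sorted(newspaper_series(rows)))
```

In this excerpt the Vedomosti column is all zeros and is therefore dropped, leaving series for the other three newspapers.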