Results Computation
This section details the process whereby individual responses are edited and aggregated in order to produce the scores of each economy on each individual question of the Survey. These results, together with other indicators obtained from other sources, feed into the GCI and other research projects.4
Data editing
Prior to aggregation, the respondent-level data are subjected to a thorough editing process. A first series of tests is run to identify and exclude surveys whose answer patterns indicate a lack of sufficient focus on the part of the respondents. Surveys with a completion rate below 50 percent are excluded.5 Surveys with straight-lined answers (e.g., only 4s or only 1s) are also excluded, as are the very few duplicate surveys, which can occur, for example, when a survey is both completed online and mailed in.
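The three screening rules above can be sketched as follows. The data layout (one list of answers per survey, with None marking an unanswered question) is an illustrative assumption, not the Forum's actual format:

```python
def edit_responses(surveys):
    """First-pass editing: drop incomplete, straight-lined, and duplicate
    surveys. Each survey is a list of answers; None = unanswered."""
    kept = []
    seen = set()
    for s in surveys:
        answers = [a for a in s if a is not None]
        # Rule 1: completion rate below 50 percent -> exclude.
        if len(answers) / len(s) < 0.5:
            continue
        # Rule 2: straight-lined surveys (e.g., only 4s or only 1s) -> exclude.
        if len(set(answers)) <= 1:
            continue
        # Rule 3: duplicates (e.g., completed both online and mailed in) -> exclude.
        key = tuple(s)
        if key in seen:
            continue
        seen.add(key)
        kept.append(s)
    return kept
```

The duplicate check here compares full answer vectors; in practice a respondent identifier could serve the same purpose.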
In a second step, a multivariate test is applied to the data using the Mahalanobis distance method. This test estimates the probability that an individual survey in a specific country “belongs” to the sample of that country by comparing the pattern of answers of that survey against the average pattern of answers in the country sample.
More specifically, the Mahalanobis distance test estimates the likelihood that a particular point in N dimensions belongs to a set of such points. A single survey made up of N answers can be viewed as a point in N dimensions, while a particular country sample c is the set of points. The Mahalanobis distance is used to compute the probability that an individual survey i does not belong to sample c. If the probability is high enough—we use 99.9 percent as the threshold—we conclude that the survey is a clear outlier and does not “belong” to the sample. Implementing this test requires that the number of responses in a country be greater than the number of answers, N, used in the test. The test uses 52 questions, selected for their relevance and placement in the Survey instrument.
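Under a multivariate-normality assumption, the squared Mahalanobis distance of a survey from its country-sample mean follows a chi-square distribution with N degrees of freedom, which gives one standard way to turn the distance into the probability described above. A minimal sketch using NumPy and SciPy (not the Forum's actual code):

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(sample, threshold=0.999):
    """Flag surveys unlikely to belong to the country sample.
    `sample` is an (n_surveys, N) array of answers to the N test questions;
    requires n_surveys > N so the covariance matrix is invertible."""
    n, N = sample.shape
    assert n > N, "need more surveys than test questions"
    mean = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    diff = sample - mean
    # Squared Mahalanobis distance of each survey from the sample mean.
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    # Under multivariate normality, d2 ~ chi-square with N degrees of freedom;
    # a cdf value above the threshold marks the survey as a clear outlier.
    return chi2.cdf(d2, df=N) > threshold
```

In the actual test N = 52, so each country sample must contain more than 52 edited responses.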
A univariate outlier test is then applied at the country level for each question of each survey. We use the standardized score—or “z-score”—method, which indicates by how many standard deviations any one individual answer deviates from the mean of the country sample. Individual answers with a standardized score greater than 3 are dropped.
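The z-score rule can be sketched as follows; we take "greater than 3" to mean the absolute standardized score, and the snippet is illustrative:

```python
import numpy as np

def zscore_filter(answers, threshold=3.0):
    """Drop individual answers to one question in one country sample
    whose standardized score exceeds `threshold` in absolute value."""
    a = np.asarray(answers, dtype=float)
    z = (a - a.mean()) / a.std()
    return a[np.abs(z) <= threshold]
```

Note that, unlike the Mahalanobis test, this test drops individual answers rather than entire surveys.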
Aggregation and computation of country averages
Through 2013, the computation of country averages used a weighting by economic sector: averages of individual responses were computed for the four main economic sectors (agriculture, manufacturing industry, non-manufacturing industry, and services) in a given country. Country averages were then derived by taking a weighted average of the sector averages using the estimated contributions of each sector to a country’s GDP as weights. The aim was to obtain a more representative average.
However, while appealing in theory, this approach presents a number of implementation challenges and limitations. First, in many countries covered by the Survey, information about economic structure is not reliable or is subject to significant revision. Special treatment is also required for 10 countries for which the breakdown of industry between manufacturing and non-manufacturing is not available. Second, the structure of the sample of responses might end up differing significantly from the actual structure of the economy, despite the efforts of our Partner Institutes, especially in challenging environments where the administration of the Survey is difficult. Third, in some major petroleum- and gas-producing countries, a handful of very large companies account for a sizeable share of the non-manufacturing sector. This means that attempting to mirror the structure of the economy would result in assigning a very high individual weight to the respondent from those companies. A related issue arises if none of those companies are surveyed, in which case the non-manufacturing sector is not represented at all in the country sample. Elsewhere, where agriculture still accounts for a large share of an economy, the agriculture sector tends to be under-represented in the Survey sample because of the difficulty of identifying respondents in that sector who have an international perspective. The issue of sectoral representation tends to be exacerbated when the sample of respondents is small.
In the presence of unbalanced samples, we used to limit the maximum implicit weight of an individual response in the sample to 10 percent.6 In some extreme cases, where a sample size was too small or the sectoral representation too different from the actual structure of the economy, this mechanism was not sufficient to prevent an individual response from receiving a disproportionate weight. In such a case, the economic sector stratification average was abandoned and a simple average of the surveys was applied.
For all these reasons, this year we decided to revert to using a simple average to compute the scores of all countries. Every individual response therefore carries the same implicit weight, regardless of the company’s sector of activity. As explained above, we will nonetheless continue to work with our Partner Institutes to obtain samples of respondents that are as representative as possible.
Formally, the country average of a Survey indicator i for country c, denoted q_{i,c}, is computed as follows:

q_{i,c} = (1 / N_{i,c}) × Σ_{j=1}^{N_{i,c}} q_{i,c,j}

where

q_{i,c,j} is the answer to question i in country c from respondent j; and

N_{i,c} is the number of respondents to question i in country c.
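A minimal numeric illustration of this unweighted average:

```python
def country_average(answers):
    """Simple (unweighted) country average q_{i,c}: each response carries
    the same implicit weight. `answers` holds the valid responses to
    question i in country c that survive the editing process."""
    return sum(answers) / len(answers)
```

For example, four edited responses of 2, 3, 4, and 5 yield a country average of 3.5 for that question.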
Moving average and computation of country scores
As a final step, the country averages for 2014 are combined with the 2013 averages to produce the country scores that are used for the computation of the GCI 2014–2015 and for other projects.7
This moving average technique, introduced in 2007, consists of taking a weighted average of the most recent year’s Survey results together with a discounted average of the previous year. There are several reasons for doing this. First, it makes results less sensitive to the specific point in time when the Survey is administered. Second, it increases the amount of available information by providing a larger sample size. Additionally, because the Survey is carried out during the first quarter of the year, the average of the responses in the first quarter of 2013 and first quarter of 2014 better aligns the Survey data with many of the data indicators from sources other than the Survey, which are often year-average data.
To calculate the moving average, we use a weighting scheme composed of two overlapping elements. On one hand, we want to give each response an equal weight and, therefore, place more weight on the year with the larger sample size. At the same time, we would like to give more weight to the most recent responses because they contain more updated information. That is, we also “discount the past.” Table 2 reports the exact weights used in the computation of the scores of each country, while Box 4 details the methodology and provides a clarifying example.
Trend analysis and exceptions
The two tests described above address variability issues among individual responses in a country. Yet they were not designed to track the evolution of country scores across time. We therefore carry out an analysis to assess the reliability and consistency of the Survey data over time. As part of this analysis, we run an inter-quartile range test, or IQR test, to identify large swings—positive and negative—in the country scores. More specifically, for each country we compute the year-on-year difference, d, in the average score of a core set of 62 Survey questions. We then compute the inter-quartile range (i.e., the difference between the 75th percentile and the 25th percentile), denoted IQR, of the sample of 146 economies.8 Any value d lying outside the range bounded by the 25th percentile minus 1.5 times IQR and the 75th percentile plus 1.5 times IQR is identified as a potential outlier. Formally, d is flagged if:

d < Q1 − 1.5 × IQR or d > Q3 + 1.5 × IQR

where

Q1 and Q3 correspond to the 25th and 75th percentiles of the sample, respectively, and

IQR = Q3 − Q1 is the difference between these two values.
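The test on the cross-country sample of year-on-year differences can be sketched as follows (illustrative only):

```python
import numpy as np

def iqr_outliers(diffs, k=1.5):
    """Flag year-on-year score changes d lying outside
    [Q1 - k*IQR, Q3 + k*IQR], where Q1 and Q3 are the 25th and 75th
    percentiles of the cross-country sample and IQR = Q3 - Q1."""
    d = np.asarray(diffs, dtype=float)
    q1, q3 = np.percentile(d, [25, 75])
    iqr = q3 - q1
    return (d < q1 - k * iqr) | (d > q3 + k * iqr)
```

A country whose score change is flagged is then examined further with the qualitative checks described below, rather than being excluded automatically.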
This test allows for the identification of potentially problematic countries, which display large upward or downward swings or repeated and significant changes over several editions. The IQR test is complemented by a series of additional empirical tests, including an analysis of five-year trends and a comparison of changes in the Survey results with changes in other indicators capturing similar concepts. We also conduct interviews with local experts and consider the latest developments in a country in order to assess the plausibility of the Survey results.
These quantitative and qualitative analyses show that the 2014 Survey data collected in Bosnia and Herzegovina, Ecuador, and Rwanda deviate significantly from historical trends, and recent developments in these countries do not seem to justify the large swings observed. In the case of Rwanda, we use only the 2013 Survey data in the computation of the Survey scores (see the Exceptions section in Box 4); Rwanda therefore remains covered in the GCI 2014–2015. Although this remains a remedial measure, we will continue to investigate the situation over the coming months in an effort to improve the reliability of the Survey data in this country.
Last year, the same analysis resulted in the Survey data of four countries—Bosnia and Herzegovina, Jordan, Oman, and the United Arab Emirates—being discarded. This year, as an intermediate step toward re-establishing the standard computation method, we used a weighted average of the 2012 and 2014 Survey data for these countries, with the exception of Bosnia and Herzegovina, which is discussed below.
In the case of Bosnia and Herzegovina, we observe a very high degree of volatility in the Survey results over the past four years. For Ecuador, the trend exhibited by the Survey results over the past four years is not corroborated by developments on the ground during that period. Therefore, as an exceptional measure, both countries are excluded from this year’s coverage. We will work closely with the respective Partner Institutes to improve the administration process and the reliability of the data, with the aim of reinstating both countries in the near future.
Conclusion
Since 1979, the World Economic Forum has been conducting a survey to gather perception data for its research on competitiveness. Over the years, the Executive Opinion Survey has become the largest poll of its kind, this year collecting the insights of more than 14,000 executives on critical drivers of their respective countries’ development. This scale could not be achieved without the tremendous efforts of the Forum’s network of over 160 Partner Institutes in carrying out the Survey at a national level. The Survey gathers valuable information on a broad range of variables for which data sources are scarce or nonexistent. For this reason, and for the integrity of our publication and related research, ensuring sampling thoroughness and comparability across the globe remains an essential and ongoing endeavor of The Global Competitiveness and Benchmarking Network.