12 Jun. 2014

The World Bank’s policy reports in Google Scholar. Are they visible, cited, and downloaded?

Although the main goal of this work is to assess the use and impact of the World Bank’s reports, it is included in the Google Scholar Digest reviews because the authors not only analyse the visibility of those documents in Google Scholar but also use this database to measure the impact of the reports through their citations. In any case, this review only addresses the results that are directly associated with Google Scholar. Since the study reviewed analyses only a limited sample of certain document types (those classified as “Economic and Sector Work” or as “Technical Assistance”) and a very specific timespan (2008-2012), in the discussion section we try to find out to what degree the reports published by the World Bank are indexed and cited in Google Scholar. To do this, we analyse the contents of the World Bank’s Open Knowledge Repository (OKR), finding that only 17.1% of the 15,319 documents deposited in the OKR are indexed in Google Scholar, and that 60% of those indexed documents have received at least one citation.


§ Can we estimate the demand and use of the World Bank’s policy reports from their download and citation counts?
§  How many of the World Bank’s policy reports are covered by Google Scholar?
§  Are the World Bank’s policy reports cited in Google Scholar?
§  Can we identify how often (and when) policy reports were downloaded?

Unit of analysis

World Bank’s policy reports: those documents within the Documents & Reports database that have been published as Economic and Sector Work, or Technical Assistance reports.
1,611 policy reports.
§  Data were gathered for all policy reports which are part of the World Bank’s Documents and Records (D&R) database.
§  Download counts were gathered using Omniture web analytics software.
§  The World Bank’s Open Knowledge Repository (OKR) was used to verify whether the policy reports were included in Google Scholar or not.
§  Number of times a PDF has been downloaded from outside the World Bank’s own website.
§  Number of times a policy report has been cited in Google Scholar.
Period analyzed:  2008-2012
Data collection date: Unknown.
1.  74.5% of the World Bank’s policy reports are indexed in Google Scholar (1,201 out of 1,611). No significant differences were found across the years studied (Figure 1).
Figure 1. Percentage of World Bank’s policy reports (Economic and Sector Work, or Technical Assistance) indexed in Google Scholar (2008-2012)
Data source: re-elaborated from Doemeland & Trevino (2014)
2.  88% of the policy reports (1,054 out of 1,201) in the sample were never cited. Of the 147 policy reports cited, 93 were cited between 1 and 5 times, and only 54 (3%) were cited more than 5 times (Table 1).

Table 1. World Bank’s policy reports (Economic and Sector Work, or Technical Assistance) cited in Google Scholar and downloaded (2008-2012)
Data source: re-elaborated from Doemeland & Trevino (2014)

3.  68% of the policy reports in the sample (1,093 out of 1,611) were downloaded (Table 1), although most of them relatively few times (40% were downloaded between 1 and 100 times). Policy reports downloaded more than 250 times make up 13% of the sample, and only 25 policy reports (2%) received more than 1,000 downloads during the period investigated.
4.  Citation counts are much lower than download counts (Figure 2). Only 12% of the indexed reports received at least one citation.
Figure 2. Percentage of World Bank’s policy reports (Economic and Sector Work, or Technical Assistance) cited and downloaded (2008-2012)
Data source: re-elaborated from Doemeland & Trevino (2014)
5.  Reports on middle-income countries with larger populations, using more expensive, complex, multi-sector, and core diagnostics, tend to be downloaded more frequently.
6.  Multi-sector reports also tend to be cited more frequently, but unlike downloads, costs are not a significant determinant of citations.
7.  The cross support provided by the World Bank’s Research Department plays an important role in increasing the demand and use of policy reports.
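
As a quick consistency check on findings 1 to 4, the percentages can be recomputed from the counts quoted there. The sketch below uses only those counts; the variable names are ours. It suggests that the 3% given for reports cited more than five times takes the full sample of 1,611 as its base, whereas the 88% and 12% figures take the 1,201 indexed reports as theirs.

```python
# Counts as reported in the findings above (Doemeland & Trevino, 2014).
TOTAL_REPORTS = 1611       # policy reports in the sample
INDEXED = 1201             # reports indexed in Google Scholar
NEVER_CITED = 1054         # indexed reports with no citations
CITED_1_TO_5 = 93          # reports cited between 1 and 5 times
CITED_MORE_THAN_5 = 54     # reports cited more than 5 times
DOWNLOADED = 1093          # reports downloaded at least once


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


cited = CITED_1_TO_5 + CITED_MORE_THAN_5

shares = {
    "indexed / all reports": pct(INDEXED, TOTAL_REPORTS),
    "never cited / indexed": pct(NEVER_CITED, INDEXED),
    "cited / indexed": pct(cited, INDEXED),
    "cited >5 / indexed": pct(CITED_MORE_THAN_5, INDEXED),
    "cited >5 / all reports": pct(CITED_MORE_THAN_5, TOTAL_REPORTS),
    "downloaded / all reports": pct(DOWNLOADED, TOTAL_REPORTS),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

The cited and never-cited counts add up to the 1,201 indexed reports, so the citation percentages only make sense relative to the reports that Google Scholar actually covers.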

The most suggestive results of this work concerning our object of study (scientific knowledge about Google Scholar) are the empirical evidence it provides on the wide and diverse coverage of Google’s academic search engine. It confirms something well known: Google Scholar, unlike traditional bibliographic databases that focus mainly on indexing journal articles and conference proceedings, collects all the types of documents produced in the scientific domain (articles, conference proceedings, books and book chapters), in academic circles (doctoral theses, master’s or undergraduate theses, teaching materials) and, of special interest in this work, in the professional world (patents, scientific/technical reports).

In this case, the documents at hand are technical reports, and specifically, those published by the World Bank. The study shows that almost 75% of the World Bank’s policy reports classified as “Economic and Sector Work” or as “Technical Assistance” between 2008 and 2012 are indexed by Google Scholar. The World Bank's Policy Research Report series brings to a broad audience the results of World Bank research on development policy. These reports are designed to contribute to the debate on appropriate public policies for developing economies (Figure 3).

Figure 3. World Bank’s policy research reports webpage
Source: http://econ.worldbank.org

The importance of this kind of document is justified not only by the institution that publishes it (the World Bank is an authoritative economic institution), but also by the influence that the research presented in these reports may have on the economic policies and the economic development of the nations concerned.

Being documents written by policy makers rather than by researchers, they reflect points of view different from those found in strictly scientific articles. And, being technical documents, they contain abundant bibliographic references, which allow their professional, economic, and even social impact to be measured in a more comprehensive way. As Google Scholar indexes these document types, analysing it makes it possible, albeit indirectly, to measure other dimensions of impact besides the academic one.

Nonetheless, since the study by Doemeland & Trevino analyses a limited sample of certain document types (those classified as “Economic and Sector Work”, or as “Technical Assistance”) and a very specific period of time (2008-2012), we feel compelled to ask:

Are all the reports published by the World Bank indexed in Google Scholar? 
Can we say that Google Scholar is exhaustive?

In order to answer these questions, we have analysed the contents of the World Bank’s Open Knowledge Repository (OKR), a repository launched by the World Bank in April 2012 with the purpose of enabling free and unrestricted access to most of its research and intellectual materials (books, articles, reports and research documents). The goal of this new open access policy for the bank’s information is that all documents are freely accessible to anybody who wants to reuse them, distribute them, or produce derivative works from them, even for commercial purposes. Emphasis is placed on the fact that documents in the OKR should be easy to find by search engines.

Currently (June 2014), the OKR contains 15,319 documents (Figure 4, upper image). According to the study by Doemeland and Trevino (2014), almost 75% of the sample was indexed in Google Scholar. Therefore, would it be safe to say that 75% of all the documents in the OKR are indexed in Google Scholar?

If we use the “site” operator in Google Scholar, it only retrieves about 2,760 results (Figure 4; bottom image). Although Google Scholar clearly states that this operator is not intended for checking the full coverage of a specific source, such a low result leads us to think that the inclusion rate for the entire OKR does not correspond with the results provided by Doemeland and Trevino in their reports sample.
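
A minimal sketch of how such a “site” query can be expressed as a Google Scholar search URL is given below. The host name `openknowledge.worldbank.org` is our assumption for the OKR's address (it is not stated in the text above), and the helper name `scholar_site_query_url` is purely illustrative. The sketch only builds the URL; it does not retrieve results, and the hit count shown on the results page is an estimate.

```python
from urllib.parse import urlencode

# Assumption: host name of the OKR; it is not given in the text above.
OKR_HOST = "openknowledge.worldbank.org"


def scholar_site_query_url(host: str) -> str:
    """Build a Google Scholar search URL restricted to one host via `site:`."""
    params = {"q": f"site:{host}"}
    # urlencode percent-encodes the colon in "site:" as %3A.
    return "https://scholar.google.com/scholar?" + urlencode(params)


print(scholar_site_query_url(OKR_HOST))
```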

Moreover, when we downloaded from Google Scholar all the records for documents hosted in the OKR, we obtained a total of 2,620 documents. This means that the difference between the number of documents that Google Scholar reports for the query and the number of records it actually returns is negligible.

Figure 4. Documents indexed on the World Bank’s repository, and in Google Scholar

The data we have obtained in this quick inquiry tell us that only 17.1% of the documents in the OKR (2,620 out of 15,319) are indexed in Google Scholar. The number of documents per year is shown in Table 2.

Table 2. Documents from OKR indexed in Google Scholar, and their citations per year
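
The coverage figure follows directly from the three counts reported above (OKR size, hit estimate for the “site” query, and records actually retrieved). The following sketch, with variable names of our own choosing, reproduces the arithmetic and also expresses the gap between the estimate and the retrieved records.

```python
OKR_TOTAL = 15319       # documents in the OKR (June 2014)
SITE_ESTIMATE = 2760    # approximate hit count reported for the site: query
RETRIEVED = 2620        # records actually downloaded from Google Scholar

# Share of the OKR covered, using the retrieved records and the estimate.
coverage_pct = round(100 * RETRIEVED / OKR_TOTAL, 1)
estimate_pct = round(100 * SITE_ESTIMATE / OKR_TOTAL, 1)

# Gap between what Google Scholar announces and what it returns.
gap = SITE_ESTIMATE - RETRIEVED
gap_pct_of_retrieved = round(100 * gap / RETRIEVED, 1)

print(f"Coverage from retrieved records: {coverage_pct}%")
print(f"Coverage from site: estimate:    {estimate_pct}%")
print(f"Gap: {gap} records ({gap_pct_of_retrieved}% of retrieved)")
```

Whichever count is used, the coverage stays below one fifth of the repository, far from the roughly 75% found in the sample of policy reports.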

To get a better idea of the types of documents from this authoritative source that Google Scholar has not indexed, it is illustrative to look at the document types that make up the OKR: Working Papers (5,106), Economic and Sector Work (ESW) Studies (3,497), Knowledge Notes (2,599), Books (1,823), Journal articles (1,768), Annual Reports & Independent Evaluations (221), Serials (121), Technical papers (135). Whatever the reasons, these data suggest that the number of scientific/technical documents on the Web is much larger than we may think, and larger than what Google Scholar can show us.
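
To put the document types in proportion, the sketch below computes each type's share of the listed total. It uses only the counts quoted above; note that they sum to slightly less than the 15,319 documents reported for the whole repository, a difference the text does not explain.

```python
OKR_TOTAL = 15319  # documents reported for the whole OKR (June 2014)

# Document types and counts as listed in the paragraph above.
doc_types = {
    "Working Papers": 5106,
    "Economic and Sector Work (ESW) Studies": 3497,
    "Knowledge Notes": 2599,
    "Books": 1823,
    "Journal articles": 1768,
    "Annual Reports & Independent Evaluations": 221,
    "Serials": 121,
    "Technical papers": 135,
}

listed_total = sum(doc_types.values())
unaccounted = OKR_TOTAL - listed_total

# Print the types from largest to smallest with their share of the listed total.
for name, count in sorted(doc_types.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count} ({100 * count / listed_total:.1f}%)")

print(f"Listed total: {listed_total}; not accounted for: {unaccounted}")
```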

As regards citations, the results we have obtained show that 60% of the 2,620 documents from the OKR indexed in Google Scholar have received at least one citation (Table 2), which differs greatly from the results obtained by Doemeland & Trevino (2014), who found that only 12% of the documents in their sample had been cited at least once.

In order to further illustrate the impact of the documents published by the World Bank in the OKR, Table 3 shows the Top 25 most cited OKR documents in Google Scholar.

Table 3. Top 25 most cited documents from the OKR in Google Scholar

Since all of them have received at least 100 citations (the most cited document even surpassing 800 citations), the scientific impact of the documents contained in the OKR is undeniably significant.

As a preliminary conclusion, we found that, even though Google Scholar gathers more document types than any other database, the visibility of World Bank reports in Google Scholar is far from complete. And this considers only the material deposited in the official repository, not the remaining material that may be hosted in other subdomains of the World Bank.

All these issues tie in directly with the subject of our previous “digest” (How many academic documents are visible and freely available on the Web?) and pave the way for new working papers which will be released soon:

-       The first of them will aim to measure with greater accuracy the proportion of documents published by the World Bank (not only in the repository) that are indexed in Google Scholar, as well as how many of them are cited.
-       The second one, of a more general nature, will focus on the size of Google Scholar.

Granada & Valencia, 12 June 2014

