28 sept. 2015

The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching

Haddaway NR, Collins AM, Coughlin D, Kirk S
The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching.
PLoS ONE 10(9): e0138237
doi:10.1371/journal.pone.0138237


Objectives
This paper analyses the use of Google Scholar as a source of research literature to help answer the following questions:
1. What proportion of Google Scholar search results is academic literature and what proportion grey literature, and how does this vary between different topics?
2. How much overlap is there between the results obtained from Google Scholar and those obtained from Web of Science?
3. What proportion of Google Scholar and Web of Science search results are duplicates and what causes this duplication?
4. Are articles included in previous environmental systematic reviews identifiable by using Google Scholar alone?
5. Is Google Scholar an effective means of finding grey literature relative to that identified from hand searches of organisational websites?

Methods
Using seven systematic review case studies from environmental science, this paper analyses the utility of Google Scholar in systematic reviews and in searches for grey literature. The search strings were either taken directly from the string used in Google Scholar in each systematic review’s methods or, where Google Scholar was not originally searched, based on the review’s academic search string. Searches in Google Scholar were performed at both “full text” level (i.e. the entire full text of each document was searched for the specified terms) and “title” level (i.e. only the title of each document was searched for the specified terms) using the advanced search facility.
Since Google Scholar displays a maximum of 1,000 search results, this was the maximum number of citations that could be analysed.

Results
Between 8 and 39% of full text search results from Google Scholar were classed as grey literature (mean ± SD: 19% ± 11), and between 8 and 64% of title search results (40% ± 17).
- Title search results in Google Scholar contained a greater percentage of grey literature (43.0%) than full text results (18.9%).
- Most grey literature documents appeared around page 80 (± 15 SD) of full text results and around page 35 (± 25 SD) of title results.
- Google Scholar showed modest overlap with Web of Science title searches, ranging from 10 to 67% of the total results in Web of Science.
- Duplicate records accounted for 0.56 to 2.93% of total results in Google Scholar and 0.03 to 0.05% in Web of Science.
- Most of the included articles from the six published systematic review case studies were found when searched for specifically in Google Scholar (94.3 to 100% of studies). However, a significant proportion of studies in one review [31] were not found at all using Google Scholar (31.5%).
- When searching specifically for individual articles, Google Scholar catalogued a larger proportion of articles than Web of Science (% of total in Google Scholar / % of total in Web of Science: SR1, 98.3/96.7; SR4, 94.3/83.9; SR6, 99.4/89.7).
- None of the 84 grey literature articles identified by systematic review 5 were found within the exported Google Scholar search results (68 total records from title searches and 1,000 of a total 49,700 records from full text searches). However, when searched for specifically, 61 of the 84 articles were identified by Google Scholar.
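
The overlap and duplication percentages reported above can be illustrated with a minimal Python sketch. It assumes that two records match when their titles are identical after lower-casing and stripping punctuation; this matching rule, the function names and the sample titles are all hypothetical and are not taken from the paper.

```python
import re
from typing import Iterable, List


def normalise(title: str) -> str:
    """Lower-case a title and collapse punctuation/whitespace (assumed matching rule)."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def duplicate_percentage(titles: List[str]) -> float:
    """Percentage of records that repeat another record in the same result set."""
    if not titles:
        return 0.0
    unique = {normalise(t) for t in titles}
    return 100.0 * (len(titles) - len(unique)) / len(titles)


def overlap_percentage(google_scholar: Iterable[str], web_of_science: Iterable[str]) -> float:
    """Percentage of unique Web of Science records also present in Google Scholar."""
    wos = {normalise(t) for t in web_of_science}
    if not wos:
        return 0.0
    gs = {normalise(t) for t in google_scholar}
    return 100.0 * len(wos & gs) / len(wos)


# Made-up records, for illustration only.
gs_results = [
    "Peatland Restoration and Carbon",
    "peatland restoration and carbon.",
    "Buffer strips and water quality",
    "A thesis on hedgerows",
]
wos_results = [
    "Buffer Strips and Water Quality",
    "Grazing effects on heathland",
]

print(duplicate_percentage(gs_results))             # 25.0
print(overlap_percentage(gs_results, wos_results))  # 50.0
```

Expressing overlap as a share of the Web of Science total mirrors the way the result is reported above; a different denominator would give a different figure.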

16 sept. 2015

Improvements in Google Scholar Citations are for the summer: creating an institutional affiliation link feature

It seems that the Mountain View company is especially fond of making changes to its flagship products in the summer. Last year, on August 21st, it announced a "Fresh Look of Scholar Profiles". This year, at almost the same time, we have learnt (not from the official Google Scholar blog, which has provided no information, but from a tweet by Isidro Aguillo) that "Google Scholar Citations add links to institution's names (incl acronyms) in correct-built affiliations of profiles".

We definitely welcome this initiative, which improves the product by offering a new and easy way to find scholars belonging to a specific institution. Previously this required specific searches by institution name or email domain in the open search box, a tedious and very unfriendly process. Now, just by clicking on the name of the institution, we can identify all scholars belonging to an organization as well as the global scientific interests and thematic focus of that institution. Incidentally, it will also facilitate the morbid, as well as dangerous, evaluative exercises that some institutions have already carried out using these profiles.

In the end, Google has implemented a new search feature in the form of an authority control tool for institutional affiliations that lies halfway between classic controlled search and natural language search.

Always attentive to the changes Google introduces in its products, we have prepared a report that explores the current implementation of this new feature. First, the tool is described, pointing out its main characteristics and how it works. Next, its coverage and precision are evaluated. Two special cases (Google Inc. and Spanish universities) are briefly examined to illustrate how accurately the tool gathers authors under their appropriate institution. Finally, some inconsistencies, errors and malfunctions are identified, categorized and described. The report ends with some suggestions for improving the feature. The general conclusion is that the standardized institutional affiliation link provided by Google Scholar Citations, despite working well for a large number of institutions (especially Anglo-Saxon universities), still has a number of shortcomings and pitfalls that need to be addressed to make this authority control tool fully useful worldwide, both for searching purposes and for metric tasks.

9 sept. 2015

Developing an Open-Source Bibliometric Ranking Website Using Google Scholar Citation Profiles for Researchers in the Field of Biomedical Informatics

Dean F. Sittig, Allison B. McCoy, Adam Wright, Jimmy Lin
Developing an open-source bibliometric ranking website using Google Scholar Citation Profiles for researchers in the field of Biomedical Informatics
Sarkar et al. (Eds.). MEDINFO 2015: eHealth-enabled Health. IMIA and IOS Press, 2015
DOI 10.3233/978-1-61499-564-7-1004


Objectives
The principal objective of this work is to develop a searchable, interactive, automatically updating, open source, bibliometric ranking website using Google Scholar Citation Profiles that includes over 1,170 Biomedical Informatics researchers from around the world: the Biomedical Informatics Researchers ranking website (rank.informatics-review.com). 
This list contains only researchers who have a Google Scholar Profile.
Methods
The website is composed of four key components that work together to create an automatically updating ranking website: 
(1) list of biomedical informatics researchers
(2) Google Scholar scraper
(3) display page
(4) updater
This open-source application is written in Node.js® and built using commonly available open-source libraries. It takes the list of researchers as input and then iteratively retrieves each person’s Google Scholar citation counts, the total number of citations, the year of first citation, the i10-index, and the h-index. These values are extracted by matching the relevant elements in each page’s DOM (Document Object Model) structure.
In addition to extracting raw statistics from profile pages, the application also calculates the citations/year, i10-index/year, and h-index/year; all computed values are written to a file in JSON format, which facilitates display as well as downstream processing by other applications.
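
The derived per-year values can be sketched as follows. The original application is written in Node.js; this minimal sketch uses Python for brevity and is not the authors' code. It assumes that "per year" means dividing by the number of years since the first citation (counting both the first and the current year), and the field names and sample profiles are hypothetical.

```python
import json


def per_year_metrics(profile: dict, current_year: int) -> dict:
    """Return a copy of the profile with per-year values added.

    Assumption: the divisor is the number of years since the first
    citation, inclusive of both the first and the current year.
    """
    years = max(current_year - profile["first_citation_year"] + 1, 1)
    enriched = dict(profile)
    enriched["citations_per_year"] = round(profile["citations"] / years, 2)
    enriched["h_index_per_year"] = round(profile["h_index"] / years, 2)
    enriched["i10_index_per_year"] = round(profile["i10_index"] / years, 2)
    return enriched


# Made-up profiles standing in for values scraped from profile pages.
profiles = [
    {"name": "Researcher A", "citations": 1200, "h_index": 20,
     "i10_index": 30, "first_citation_year": 2006},
    {"name": "Researcher B", "citations": 4500, "h_index": 30,
     "i10_index": 60, "first_citation_year": 2001},
]

# Rank by h-index (highest first) and serialise to JSON for the display page.
ranked = sorted(
    (per_year_metrics(p, current_year=2015) for p in profiles),
    key=lambda p: p["h_index"],
    reverse=True,
)
output = json.dumps(ranked, indent=2)
print(output)
```

Keeping the raw and derived values together in one JSON document means a display page or any other consumer can sort by whichever metric it needs without recomputing anything.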
Results
The h-index correlated with total citations (r² = 0.8) and with the i10-index (r² = 0.93).
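
A value such as r² can be obtained as the square of the Pearson correlation coefficient between two lists of metrics. The sketch below is a generic implementation under that assumption; the summary does not state how the authors computed their figures, and the sample values are made up.

```python
from typing import Sequence


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance ** 2 / (var_x * var_y)


# Made-up h-index and total-citation values, for illustration only.
h_indexes = [12, 18, 25, 31, 40]
total_citations = [600, 1500, 2400, 5200, 9000]

print(round(r_squared(h_indexes, total_citations), 2))
```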
The Biomedical Informatics Researchers ranking website is