28 Jan. 2016

A new Almetrics: Bibliometricians in Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter. The counting house: measuring those who count

Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place. A set of new scientific tools is now providing a variety of indicators that measure all actions and interactions among scientists in the digital space, bringing new aspects of scientific communication to light.

In light of these new developments, we announce the publication of a new working paper: “The counting house, measuring those who count: Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in GSC (Google Scholar Citations), ResearcherID, ResearchGate, Mendeley, & Twitter”. You may access it from the following link: 

This study presents, for the first time, a comparison of 31 author-level bibliometric indicators: number of publications, number of citations, usage (views and downloads of bibliographic records and documents), social media presence (tweets) and connectivity (followers/following). These indicators were extracted from various platforms such as GSC (Google Scholar Citations), ResearcherID, ResearchGate, Mendeley, and Twitter. Each of these platforms was designed for a different purpose, and they use different data sources.

Our goal is to learn what each of these indicators can tell us about researchers. We studied their correlations to learn what they measure and the extent to which each indicator is similar to or different from the rest. We compared authors rather than articles, which are the usual unit of comparison. With this we open the door to a new ALMetrics: Authors Level Metrics, an assessment of the performance of authors in the scientific Web, whether through the information found in bibliographic databases, academic search engines, specialized or institutional repositories, or social media.

Our sample consists of 814 researchers for whom Bibliometrics is either the main research interest (core authors) or a secondary research interest (related authors). The list of researchers and their indicators can be browsed at http://www.scholar-mirrors.infoec3.es.

Other goals of this study are:
- Comparing the user metric portraits generated by Google Scholar Citations to those offered by new platforms for the management of personal bibliographic profiles (ResearcherID, ResearchGate, and Mendeley) and content dissemination and communication (Twitter).
- Testing the completeness, reliability and validity of the information provided by Google Scholar Citations (to generate disciplinary rankings), and by the remaining social platforms (to generate complementary academic mirrors of the scientific community).

Among the results, the most important are the following:
- GSC is the most used platform, followed at a distance by ResearchGate (543 authors), which is currently growing very rapidly. The number of Mendeley profiles is high, although this figure is misleading on its own, since 17.1% of the profiles are basically empty. ResearcherID is also affected by this issue (34.45% of the profiles are empty), as is Twitter (47% of the 240 authors with a Twitter profile have published fewer than 100 tweets).

- All metrics provided by Google Scholar correlate strongly among themselves. This confirms a fact we already knew.
- Google Scholar Citations is also the platform whose indicators (especially the h-index and the h5-index) achieve the highest correlation with the indicators in other platforms, whether they be production, citation, or usage indicators. There is one exception: GS indicators don't correlate well with those related to online social networking (most Twitter indicators, and the followers and following indicators from ResearchGate and Mendeley).
- Regarding ResearchGate, we find a separation between the usage metrics (views and downloads) and the citation metrics (Citations, Impact Points). Indicators from ResearchGate achieve moderate to high correlations among themselves, except for the networking indicators (followers), which don't correlate well with the rest. The RG Score, a proprietary indicator for which no method of calculation has been disclosed, displays a good correlation with the other citation-based indicators, especially the ones available in GSC.
- There is a high correlation between the Mendeley “Readers” indicator and the total number of publications (0.83), which seems logical because the number of readers is largely restricted by the number of papers an author includes in his/her profile. Additionally, the Mendeley “Readers” indicator correlates with the RG Score, and strongly correlates with Google Scholar's total citations and h-index.
- Indicators from ResearcherID strongly correlate among themselves, but are slightly separated from other citation metrics (those from Google Scholar and ResearchGate). This can probably be explained by how infrequently ResearcherID profiles are updated.
- The online presence indicators obtained from Twitter moderately correlate among themselves, but they don't correlate well with indicators from other platforms. Clearly, they measure a different aspect of the social activities of scientists.
- The followers and following indicators correlate with each other on every platform. The highest correlation is found in Mendeley (0.96), followed at a distance by Twitter (0.81), and lastly ResearchGate (0.70). The Latin expression «Do ut des» ("I give so that you will give") comes to mind. I follow you and you follow me... that is, reciprocal relationships.
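
For readers who want to try this kind of comparison on their own data, here is a minimal sketch of how a correlation between two author-level indicators might be computed. It rests on several assumptions: this post does not say which coefficient the working paper used, so Spearman rank correlation is assumed here as a common choice for skewed count data; the indicator names and the five-author values are made up for illustration and are not taken from the study; and the helper functions `ranks`, `pearson`, and `spearman` are our own names, not part of any platform's API.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values for five authors (illustrative only, not study data).
indicators = {
    "gsc_citations": [1200, 300, 45, 800, 10],
    "gsc_h_index": [18, 9, 4, 15, 2],
    "twitter_followers": [50, 900, 120, 30, 400],
}

# Pairwise correlation matrix, one row per indicator.
names = list(indicators)
matrix = {
    (a, b): spearman(indicators[a], indicators[b]) for a in names for b in names
}

for a in names:
    print(a, [round(matrix[(a, b)], 2) for b in names])
```

In this toy data the two citation-based indicators rank the authors identically, while the follower counts rank them quite differently, which mirrors the kind of separation between academic impact and connectivity described in the list above.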

To summarize, we found two kinds of impact on the Web: first, all metrics related to academic impact, and second, all metrics associated with connectivity and popularity (followers). The first group can further be divided into usage metrics (views and downloads) and citation metrics. The correlation among them is very high, especially between all metrics from Google Scholar, the RG Score (ResearchGate), and the Readers indicator in Mendeley.

Lastly, we show that it is feasible to draw an accurate picture of the current state of the Bibliometrics community using data from Google Scholar Citations (the most influential authors, documents, journals, and publishers). This model could be extended to any discipline, scientific or professional community, or to any institution (universities, research centers, companies).

However, given that these platforms and their indicators have already been used by many institutions to assess scientists, we want to stress the dangers of blindly using any of them for the assessment of individuals without verifying the accuracy and completeness of the data. To that end, as a reminder of the reliability problems that afflict these platforms, we offer a taxonomy of the errors that may affect the data contained in each of them, with a special emphasis on GSC, since it has been our main source of data.