All 3 entries tagged Gscholar
August 22, 2011
Writing about web page http://dx.doi.org/10.3998/3336451.0013.305
I've linked to this article: J. Beel and B. Gipp, 'Academic Search Engine Spam and Google Scholar's Resilience Against it', Journal of Electronic Publishing, 13 (3), 2010
The article discusses possibilities for academic search engine optimisation, and what happens when this becomes spamming activity. It has a very neat description of how people go about spamming search engines, and it considers some of the ways that scholars can manipulate academic search engines.
If (like me) you aren't already aware of all these nefarious techniques, then the article will be an eye-opener! The researchers experimented on Google Scholar, using the following approaches:
- "When creating an article, an author might place invisible text in it. This way, the article later might appear more relevant for certain keyword searches than it actually is."
- "A researcher could modify his own or someone else’s article and upload it to the Web. Modifications could include the addition of additional references, keywords, or advertisements."
- "A manipulating researcher could create complete fake papers that cite his or her own articles, to increase rankings, reputation, and visibility."
I find it interesting that three of the sites they used in their manipulation were Mendeley, Academia.edu and ResearchGate. I've blogged about these sites and their ilk before, and I've suggested to researchers that having details about their work on these sites would help them raise their profiles on the Web. The article says that only the papers uploaded to academia.edu were crawled and indexed by Google Scholar... which is kind of good news for the robustness of Google Scholar. It is also an indication that those searching for full-text versions of articles on the web should go directly to the kinds of sites which might hold them (I do recommend Mendeley), and not rely only on search engines.
After reading this article, I want to know how to go about modifying a journal article after it has been published (including those not your own!), in order to add references. The authors didn't go into detail about how to do that, but you can imagine the havoc it would play with Google Scholar's citation scores if we were all doing it!
I note that the authors described the journal 'Epidemiology' as "a reputable journal by the publisher JSTOR". JSTOR is not a publisher, it is a content aggregator. 'Epidemiology' is published by Wolters Kluwer. It probably takes a librarian to know this, and I wonder whether it is relevant anyway. It could be a deliberate faux pas on the part of the authors, because it kind of illustrates their point that people don't know where content online is coming from! And the authors are right that a journal available on JSTOR is a reputable academic title.
The discussion section of the paper explains that spamming academic search engines takes a lot of effort, that the benefit is neither immediate nor measurable for academics, and that academics are unlikely to undertake such work because their reputation is so valuable and could be permanently damaged if a search engine were to ban all their articles once the spamming activity was discovered. The authors raise the question of whether a journal or conference might engage in search engine spamming: they don't mention academic institutions, but I believe that universities could also have a motivation.
I do worry about where we draw the line between authors or journals raising their profile in legitimate ways, and where spamming begins. I have long advised authors to include key words in their article titles because of the way journal indexing tools rank results, and this seems to me to make good sense from both a "discovery optimisation" point of view and an academic accuracy perspective. I also believe that self-citation is a good idea, in that false modesty is pointless and potentially damaging; but authors ought to know whether their earlier work is relevant to their latest article, and how well such practice is accepted in their own field, and therefore be able to self-cite with caution.
The big question about all such profile-raising practices, for me, is how far should we go? This article doesn't give the answer, but it describes an awful lot more about what could be done. The authors conclude by suggesting: "the academic community needs to decide what actions are appropriate and when academic search engine optimization ends and academic search engine spam begins."
July 04, 2011
Writing about web page https://addons.mozilla.org/en-US/firefox/addon/scholar-h-index-calculator/
Last week I found out about a Mozilla Firefox extension which I've linked to from this post. It looks very useful in that it calculates the h-index and various other index scores for the results of any search you perform on Google Scholar, once you've installed it. If you're an author wanting to know your own h-index then the trick is to get your results set to include all of your own works. The advanced analysis feature of the extension allows you to un-tick certain results from the calculations presented in the panel at the top of your results set.
Only 100 results are processed in the analysis, so it isn't going to be a great tool for those with hundreds of publications to their name.
The tool presents not only the h-index but also the g-index, which gives extra weight to an author's most highly cited papers, and the e-index, which counts "excess citations" beyond those needed for the h-index. You can read more about the e-index in a PLoS ONE article published in 2009 at: http://dx.doi.org/10.1371/journal.pone.0005429
It also presents "delta-h" and "delta-g" scores, which look really useful for authors who want to know how close they are to raising their index scores.
July 08, 2010
Writing about web page http://www.timeshighereducation.co.uk/story.asp?storycode=412341
I was just looking at university ranking methodologies after reading the cover article in today's THE. THE will be using Thomson Reuters (TR) data, and said they are going to normalise citation data by subject as part of their new methodology. That sounds interesting, since I was asked at a departmental visit today how researchers can tell whether their "scores" are good or normal for their discipline. I'm not sure how to look up the average h-index or article citation number for a particular discipline.
Shanghai's World ranking uses TR data as well, and Webometrics uses GScholar. QS, who were previously the providers of THE's ranking, have used Scopus data since 2007.
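The THE article doesn't spell out the exact formula, but one standard way to normalise citations by subject (my assumption here, along the lines of a mean normalised citation score) is to divide each paper's citations by the average for its field, then average those ratios:

```python
def mean_normalised_citation_score(papers):
    """papers: list of (citations, field_average_citations) pairs.
    Each paper's count is divided by the average citation count for
    its field, then the ratios are averaged. A score of 1.0 means
    citation impact exactly at the field average; 2.0 means double it."""
    if not papers:
        return 0.0
    return sum(cites / field_avg for cites, field_avg in papers) / len(papers)

# A paper with 10 citations in a field averaging 5, and one with
# 3 citations in a field averaging 6:
print(mean_normalised_citation_score([(10, 5.0), (3, 6.0)]))  # 1.25
```

This is why subject normalisation matters for the departmental question above: 3 citations can be "normal" in one discipline and well below par in another, and the raw counts alone can't tell you which.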