Monday 8 August 2016

publications - How to automatically extract submitted/accepted dates of many journal papers?


I am researching the processing times of papers published in journals in my field. I have noticed that the metrics the journals advertise (e.g. the Elsevier journal insights) match neither my experience nor recently published papers, so I want to run my own survey. (My guess is that they include papers that the editor rejects immediately without sending them out for review, which makes the average look quite favourable. I am more interested in the average time for papers that are actually accepted.)


I plan to cover all papers published recently (in the last 12 months) in 10-20 journals from different publishers (e.g. Elsevier, T&F, Wiley), which will amount to hundreds of papers. For each paper, I will take the dates on which it was submitted, accepted, and published online, and calculate the averages per journal.


Is there a way to automatically extract this information?



Answer



Have you checked that this data is actually made available for your preferred journals? IME not all of them make their submitted/accepted/first-online dates easily accessible, though this has improved a bit recently.


If it's there, your best bet is probably to screen-scrape the HTML. Some journals provide nice clean XML to play with, but these are usually new online-only titles rather than legacy ones from traditional publishers.



Elsevier use a simple HTML element (class="articleDates") which contains the core dates:


Received 23 March 2015, Revised 15 May 2015, Accepted 18 May 2015, Available online 9 June 2015
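Once you have a line like that as plain text, turning it into dates is straightforward. Here is a minimal Python sketch using only the standard library. The function name `parse_article_dates` is made up for illustration, and the labels in the pattern are simply the ones from the example line above, so other journals may well need a different list.

```python
import re
from datetime import datetime

# Labels taken from the example line above; adjust per publisher as needed.
DATE_PATTERN = re.compile(
    r"(Received|Revised|Accepted|Available online)\s+(\d{1,2}\s+[A-Za-z]+\s+\d{4})"
)

def parse_article_dates(text):
    """Map each label found in `text` to a datetime.date.

    If a label occurs more than once (e.g. several revisions),
    the last occurrence wins.
    """
    dates = {}
    for label, raw in DATE_PATTERN.findall(text):
        raw = " ".join(raw.split())  # collapse stray whitespace or linebreaks
        # %B expects English month names under the default C/English locale.
        dates[label] = datetime.strptime(raw, "%d %B %Y").date()
    return dates

line = ("Received 23 March 2015, Revised 15 May 2015, "
        "Accepted 18 May 2015, Available online 9 June 2015")
dates = parse_article_dates(line)
days_to_accept = (dates["Accepted"] - dates["Received"]).days
days_to_online = (dates["Available online"] - dates["Accepted"]).days
print(days_to_accept, days_to_online)  # 56 22
```

Collecting `days_to_accept` for every paper in a journal and averaging them gives the per-journal figure the question asks for.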


Taylor & Francis have similar information to Elsevier: the element you'd need is again "articleDates", but it unfortunately has a lot of linebreaks in it for no good reason!
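The stray linebreaks are easy to get rid of once the element's text has been pulled out. Below is a rough sketch using Python's built-in `html.parser`; the sample markup is invented for illustration (the real pages will look different), and `ClassTextExtractor` and `extract_dates_text` are hypothetical helper names, not part of any publisher's API.

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "wbr"}

class ClassTextExtractor(HTMLParser):
    """Collect the text inside any element carrying a given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

def extract_dates_text(html, class_name="articleDates"):
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    # split() + join collapses linebreaks and runs of spaces into single spaces
    return " ".join(" ".join(parser.chunks).split())

# Invented sample markup, only to show the whitespace clean-up.
sample = """<div class="articleDates">
  Received 23 March 2015,
  <br>Accepted
     18 May 2015
</div><p>Other text</p>"""
print(extract_dates_text(sample))
```

If you are happy to install a third-party parser, a library such as BeautifulSoup would make the extraction step shorter, but the whitespace clean-up is the same idea.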


Finally, Wiley don't seem to expose submitted/accepted dates (at least not for all journals); "publicationHistoryDetails" just gives first-online, which isn't much help.

