Sunday, 22 December 2019

What are the archiving policies of arXiv?


This answer suggests that arXiv, as its name implies, is archival. One of Beall's criteria is that a publisher is potentially predatory if it



Has no policies or practices for digital preservation, meaning that if the journal ceases operations, all of the content disappears from the internet.



The only thing I can find about digital preservation on the arXiv website is:



arXiv submissions are meant to be available in perpetuity. Thus, arXiv has high technical standards for the files that are submitted.




While it is good that the articles are in a format that will allow access in perpetuity, the primer says nothing about what happens if arXiv ceases operations. What is arXiv's policy with regard to digital preservation?



Answer



From their FAQ:



What are CUL's preservation strategies?


Digital preservation refers to a range of managed activities to support the long-term maintenance of bitstreams. These activities ensure that digital objects are usable (intact and readable), retaining all quantities of authenticity, accuracy, and functionality deemed to be essential when articles (and other associated materials) were ingested. Formats accepted by arXiv have been selected based on their archival value (TeX/LaTeX, PDF, HTML) and the ability to process all source files is actively monitored. The underlying bits are protected by standard backup procedures at the Cornell campus. Off-site backup facilities in New York City provide geographic redundancy. The complete content is replicated at arXiv's mirror sites around the world, and additional managed tape backups are taken at Los Alamos National Laboratory. CUL has an archival repository to support preservation of critical content from institutional resources, including arXiv. We anticipate storing all arXiv documents, both in source and processed form, in this repository. There will be ongoing incremental ingest of new material. We expect that CUL will bear the preservation costs for arXiv, leveraging the archival infrastructure developed for the library system.



It looks like they're relying on a) multiple offsite mirrors; b) periodic stored backups at LANL; and c) deposit in the institutional repository at Cornell.


It's a little unclear if that deposit is actually happening yet or is still part of a long-term plan, but it's worth noting that the arXiv program director is also the librarian responsible for Cornell's digital preservation work, so it's unlikely to have been forgotten about!


