Open science, research, access, data, etc.
Open Science
Open Access
The idea that distribution of scientific output should be free of charge, and of other barriers.
Open Data
In general, open data is data that is openly accessible, and usable for any purpose.
Typically licensed under an open license, because that's the only real way that last sentence can be true.
In some areas, data may still be released under restricted terms. This isn't really open: it's typically just copyrighted with some specific leeway, and definitely blocks some purposes.
Open data in research
In research/academia, Open Data is a little more specific, namely the idea that data collected to support research should be freely available.
The aims are to be able to verify it, to be transparent, to use it in further research beyond the one paper it was collected for (maybe a few more if it was useful data), etc.
Each field has its own footnotes.
Since a lot of research data is expensive to collect, you may not want other labs to profit off your expensively paid-for data collection.
The practice of publishing papers is relatively slow - and used to be even slower, and also expensive, when publishing/distributing meant paper and books. So people were classically protective of their data, and understandably so, and we are only slowly moving away from this view.
And since publishing first means a lot in academia, it is typical that data is only made available after all publications are out.
Open Research
Not about open data within research, but about the research methods - making research more transparent in more of its parts, often with the point of making reproducibility and collaboration more likely.
This often amounts to sharing both the methods and data, making a point to say more than "we did a thing, we didn't forget to pay attention to X and Y, and here are our conclusions".
https://en.wikipedia.org/wiki/Open_research
Open-notebook science
Open-notebook science basically means making the process and developments along the way (lab notebook and/or personal notebook) public, and not just the end result.
It doesn't have to be everything, nor does it have to be in raw form;
in fact both might be counterproductive, because they might not be coherent enough.
The idea is more to
- help science have a public record of data that would otherwise not see the light of day because the result was less interesting to publish.
- help deeper understanding of the research (for both laypeople and scientists), by seeing intermediate results and the development of theories, methods of analysis, etc.
https://en.wikipedia.org/wiki/Open-notebook_science
Principles and ideas
FAIR
FAIR (Findable, Accessible, Interoperable and Re-usable) is a (minimal, community-agreed) set of principles and practices that make data not only open in the sense that it's out there somewhere, technically, but also easy to find and use.
It could be purely about metadata, or about data as well - this depends a bit on the type of repositories we're talking about, and the kind of organization behind it.
- Note that when it describes data, this overlaps with open access details.
Findable - ideally has rich metadata, persistent identifiers, and is searchable by those
Accessible - fetchable at all, fetchable by identifier, allows auth if necessary (also, removing data doesn't mean removal of metadata)
Interoperable - uses metadata and knowledge representation that is accessible, broadly applicable and useful for analysis and processing (verify)
- somewhat vague. Roughly "is your metadata sensible for the job, and can I read it out at all"
- the last, 'are there qualified references', is vague. Does it mean links to related resources? Browsability by subject headers? Using codes to disambiguate things confusable in text?
Re-usable - clearly disambiguated resources?, clear license, showing origin of data (verify)
- also "domain relevant community standards", which seems vague.
The basics can be considered in terms of compliance: clear use of PIDs, clear metadata and machine readability (also for indexing), and clear licensing.
Some further requirements are more open-ended, as pointed out in A Dunning (2020) "Are the FAIR Data Principles Fair?", which evaluates to what degree repositories currently adhere to these principles.
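As a rough illustration of those basics (PID, minimal metadata, license), here is a minimal sketch of what a machine check on a single metadata record could look like. Everything in it is an assumption for illustration: the flat field names (identifier, title, creator, date, license), the short list of licenses treated as open, the loose DOI pattern, and the example DOI are all made up here, and none of this comes from the FAIR principles themselves. Real repositories would use a schema like DataCite or Dublin Core, and the open-ended parts of FAIR cannot be checked this mechanically.

```python
import re

# Assumption: a small, non-exhaustive set of SPDX license identifiers we treat as "open".
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0", "ODbL-1.0"}

# Assumption: a loose pattern for the general shape of a DOI (prefix "10.", digits, slash, suffix).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def fair_basics(record: dict) -> dict:
    """Check a flat metadata record (hypothetical field names) for the basics."""
    identifier = str(record.get("identifier") or "")
    return {
        # Findable: is there something shaped like a persistent identifier?
        "has_pid": bool(DOI_PATTERN.match(identifier)),
        # Findable/Interoperable: are a few minimal descriptive fields filled in?
        "has_metadata": all(record.get(k) for k in ("title", "creator", "date")),
        # Re-usable: is there a license, and is it one we consider open?
        "has_open_license": record.get("license") in OPEN_LICENSES,
    }


def doi_url(doi: str) -> str:
    """Accessible: a DOI can be turned into a resolvable URL via the doi.org resolver."""
    return "https://doi.org/" + doi


# Made-up example record; the DOI does not refer to a real dataset.
example = {
    "identifier": "10.1234/example.5678",
    "title": "Example dataset",
    "creator": "A. Researcher",
    "date": "2020",
    "license": "CC-BY-4.0",
}

if __name__ == "__main__":
    print(fair_basics(example))
    print(doi_url(example["identifier"]))
```

A record that passes all three checks is not thereby FAIR; this only covers the parts that are easy to express as yes/no questions.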
See also:
- "To be findable, accessible, interoperable and reusable: language data and technology infrastructure for supporting the FAIR data approach"
Relatedly
Open practices
Organisations and initiatives
COAR
IOI
DOAJ
Directory of Open Access Journals
OJS
PKP's Open Journal Systems is software that basically helps you set up an open access journal of your own.
Services and implementations
CORE
Aggregator of open access research papers from repositories and journals.
Aimed at stakeholders like researchers, the public, academic institutions, developers, funders, companies
- APIs
- Dataset download