Dealing with user-contributed LOD: issues, opportunities and applications


Empowering users (other researchers, citizen scholars, or the “crowd”) to annotate or create data is a low-cost way to expand datasets. The applications are countless, ranging from transcription, translation, and metadata and OCR correction to the generation of a wide array of user-contributed content.

[Image: crowdsourcing vs open source]

The ways of engaging users in different types of content creation vary widely, however, and they raise a number of concerns:

Quality control: Are there effective workflows for evaluating user-contributed content? Can we crowd-source quality control? How do we even keep spam out of our crowdsourcing tools?

Ethics: What are the ethical considerations associated with sourcing free labour in this way? What kind of disclosure regarding data use and reuse is required? What kind of credit is appropriate? Are there risks related to exposure and recirculation of data beyond the context of contribution?

Management: What are examples of effective interfaces? What workflows are required? How should provenance be tracked and made evident?
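
As one possible starting point for the quality-control and provenance questions, here is a minimal Python sketch. It is not drawn from any of the projects mentioned in this post: the `Contribution` record, its field names, the `accepted_value` function, and the agreement threshold are all hypothetical, chosen only to illustrate the idea. Each contribution carries who supplied it, through which interface, and when, and a value is accepted only once enough distinct contributors have independently supplied the same thing.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Contribution:
    """One user-supplied statement plus the provenance needed to audit it.

    All field names are hypothetical, for illustration only.
    """
    subject: str      # item being described
    predicate: str    # property being asserted
    value: str        # the contributed value
    contributor: str  # who supplied it, so credit can be given
    source: str       # interface or project the contribution came through
    created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def accepted_value(contributions, min_agreement=2):
    """Return the value that at least `min_agreement` distinct contributors
    supplied, or None if no value reaches that threshold.

    Assumes every contribution describes the same subject and predicate.
    Ties between equally popular values are not handled specially.
    """
    votes = Counter()
    seen = set()
    for c in contributions:
        # Count each contributor at most once per value, so one user
        # cannot reach the threshold by resubmitting the same value.
        key = (c.contributor, c.value)
        if key not in seen:
            seen.add(key)
            votes[c.value] += 1
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count >= min_agreement else None


batch = [
    Contribution("page-12", "transcription", "Dear Sir", "alice", "web-form"),
    Contribution("page-12", "transcription", "Dear Sir", "bob", "web-form"),
    Contribution("page-12", "transcription", "Dear Sur", "carol", "web-form"),
]
print(accepted_value(batch))  # prints "Dear Sir"
```

Keeping the contributor and source on every record means rejected values are still attributable, which bears on the credit and disclosure questions raised under Ethics. A real project would also have to decide how to handle ties, anonymous contributors, and coordinated spam.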

I’m hoping we can talk about the practicalities of this: use cases; examples of projects engaging citizen scholars or the public in LOD production; and tools and interfaces for managing the process.

Some sites to consider:

Transcribe Bentham

Old Weather

Linked Jazz 52nd Street



What others are relevant? What are the best models for certain kinds of user contributions?
