Question 1

What dataset modalities are handled?

Accepted Answer

Text, image, video, audio, and multimodal datasets. Within text: named entity recognition, intent and topic classification, sentiment and toxicity, span extraction, and rewriting. Within image and video: bounding boxes, polygons, semantic segmentation, landmark points, object tracking, and event detection. Within audio: speaker diarization, sound event tagging, transcript segmentation, and prosody marks. Multimodal pairings (image–text, video–caption, screenshot–intent) are scoped per project.

Question 2

What does the request check confirm before labeling starts?

Accepted Answer

Dataset modality, language coverage, label categories with examples per class, reviewer instructions, acceptance criteria, output file format, and confidentiality controls. Sample rows are run through the rule set first, and the results are checked against your expectations before bulk labeling begins. This catches rule-set ambiguity in 50 rows rather than 5,000.
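As an illustration only, the sample comparison could look something like the sketch below. The row keys (`id`, `expected`, `labeled`) and the function name are assumptions made for this example, not a description of the actual tooling.

```python
from collections import Counter

def calibration_report(sample_rows):
    """Compare labels produced under the rule set with the client's expected labels.

    Each row is a dict with hypothetical keys: 'id', 'expected', 'labeled'.
    """
    mismatches = [r for r in sample_rows if r["expected"] != r["labeled"]]
    # Count which (expected, labeled) pairs disagree; repeated pairs point at an ambiguous rule.
    confusions = Counter((r["expected"], r["labeled"]) for r in mismatches)
    agreement = 1 - len(mismatches) / len(sample_rows) if sample_rows else 0.0
    return {
        "agreement": agreement,
        "mismatch_ids": [r["id"] for r in mismatches],
        "confusions": dict(confusions),
    }

# Hypothetical four-row sample for a sentiment rule set.
sample = [
    {"id": 1, "expected": "positive", "labeled": "positive"},
    {"id": 2, "expected": "negative", "labeled": "neutral"},
    {"id": 3, "expected": "neutral", "labeled": "neutral"},
    {"id": 4, "expected": "negative", "labeled": "neutral"},
]
report = calibration_report(sample)
print(report)
```

In this made-up sample, both disagreements are "negative" rows labeled "neutral", which would suggest the boundary between those two classes needs a clearer rule before bulk labeling.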

Question 3

How are reviewer outputs tracked?

Accepted Answer

Per-row trace. Each labeled row carries the reviewer name (or anonymized ID), the label decision, and any notes attached during the work. When a flagged row needs a second look, the original decision is visible and the reviewer can be brought back to the question rather than starting from scratch.
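A minimal sketch of what one traced row might contain, assuming a JSONL export. The field names (`row_id`, `reviewer_id`, `label`, `notes`, `flagged`) are hypothetical and chosen for illustration; the delivered schema is whatever is agreed in the request check.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List

@dataclass
class TraceRow:
    row_id: str
    reviewer_id: str          # reviewer name or anonymized ID
    label: str                # the label decision
    notes: List[str] = field(default_factory=list)
    flagged: bool = False     # set when the row needs a second look

def to_jsonl(rows):
    """Serialize trace rows as one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in rows)

def flagged_for_review(rows):
    """Return (row, reviewer, original decision) for every flagged row."""
    return [(r.row_id, r.reviewer_id, r.label) for r in rows if r.flagged]

rows = [
    TraceRow("r-001", "rev-17", "toxic", notes=["sarcasm, borderline"], flagged=True),
    TraceRow("r-002", "rev-04", "non-toxic"),
]
print(to_jsonl(rows))
print(flagged_for_review(rows))
```

Because the original decision and reviewer travel with the row, a flagged item can be routed back to the same reviewer with the earlier notes attached.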

Question 4

How long does a labeling pass take?

Accepted Answer

Standard turnaround for a 5,000-row text labeling pass is 5–7 working days from receipt of the rule set and a clean dataset. Image, video, and audio cadence depends on annotation type (bounding box vs polygon vs segmentation, for example) and reviewer load. Every project is quoted with a confirmed delivery date in writing rather than a vague estimate.

Question 5

How is confidentiality handled for proprietary datasets?

Accepted Answer

An NDA is signed before any file transfer when requested. Files are kept on access-restricted storage, named-reviewer handling is available for sensitive datasets, and files are deleted on a defined schedule after project close. Reviewer access scopes can be aligned with your security posture on request.

Question 6

Can rare-language or dialect-sensitive datasets be handled?

Accepted Answer

Yes. Coverage spans 250+ languages, including rare languages for which most data-labeling marketplaces cannot source qualified reviewers. For dialect-sensitive work (Levantine vs Gulf Arabic, Brazilian vs European Portuguese, Mandarin vs Cantonese, etc.) the dialect target is confirmed in the request check and reviewers are matched to it.

Question 7

Can existing rule sets be reused on new data?

Accepted Answer

Yes. Send the prior rule set, prior examples, and any glossary. The rule set is applied to your new data with consistency notes added for any edge case that the original rules do not cover. If the dataset is multilingual, the rule set is translated and adapted per language so reviewers in each language work from the same definitions.

Question 8

What output formats are supported?

Accepted Answer

JSONL, CSV, COCO, YOLO, Pascal VOC, and platform-specific exports (Label Studio, Labelbox, Scale AI, SuperAnnotate, V7, and others on request). Output format is confirmed during the request check so the labeled dataset drops into your downstream pipeline without a separate transformation step.
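To show why the format matters, here is a sketch of the kind of transformation step a mismatched export would otherwise require. It uses the standard conventions: a COCO box is `[x_min, y_min, width, height]` in pixels, and a YOLO box is `x_center, y_center, width, height` normalized by image size. The function name and sample values are made up for illustration.

```python
def coco_to_yolo(bbox, img_width, img_height):
    """Convert a COCO pixel box [x_min, y_min, w, h] to a YOLO normalized box.

    Returns (x_center, y_center, width, height), each in the range 0..1.
    """
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_width
    y_center = (y_min + h / 2) / img_height
    return (x_center, y_center, w / img_width, h / img_height)

# Hypothetical box on a 400x400 image.
yolo_box = coco_to_yolo([50, 100, 200, 100], img_width=400, img_height=400)
print(yolo_box)
```

Confirming the target format up front means conversions like this are not left for your pipeline to handle.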

Scope data labeling services with rules, reviewers, and acceptance defined first.

What DD can show before a buyer commits.

How the work runs

Scope rules

Calibrate on samples

Label in batches

Trace reviewers

Confirm acceptance

What this page helps you send

What you receive

Questions teams ask first

Get the right scope in writing.
