Published release
- Total Images: 12,400. Published across train, validation, and test partitions.
- Total Annotations: 27,800. Bounding-box instances carried through canonical, YOLO, and COCO exports.
- Classes: 3. Cyst, debris, and root remain explicit throughout conversion.
- Published Splits: 3. Separate partitions for fitting, tuning, and held-out reporting.
Release snapshot
Why these statistics matter
This page is meant to serve as a release-quality summary rather than a placeholder dashboard. It frames the dataset around split-aware evaluation, inspection-friendly canonical coordinates, and class context that stays intact when you export back into training formats.
- Focused on SCN cyst localization and counting rather than generic agricultural detection.
- Canonical JSON keeps denormalized x1, y1, x2, y2 boxes for easier visual inspection and QA.
- The same internal dataset representation is reused to generate YOLO- and COCO-oriented exports.
- Debris and root remain in the vocabulary so false positives and contextual errors can be studied directly.
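The relationship between the canonical boxes and the two export formats can be sketched as follows. This is a minimal illustration, not the project's converter: the function names are hypothetical, and it assumes the usual YOLO convention (center x, center y, width, height, all normalized by image size) and the usual COCO convention (pixel x, y of the top-left corner, then width and height).

```python
from typing import List, Sequence, Tuple


def canonical_to_yolo(
    box: Sequence[float], img_w: float, img_h: float
) -> Tuple[float, float, float, float]:
    """Convert a pixel (x1, y1, x2, y2) box to normalized YOLO (cx, cy, w, h)."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0 / img_w   # box center, as a fraction of image width
    cy = (y1 + y2) / 2.0 / img_h   # box center, as a fraction of image height
    w = (x2 - x1) / img_w          # box width, as a fraction of image width
    h = (y2 - y1) / img_h          # box height, as a fraction of image height
    return cx, cy, w, h


def canonical_to_coco(box: Sequence[float]) -> List[float]:
    """Convert a pixel (x1, y1, x2, y2) box to COCO [x, y, width, height]."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]


if __name__ == "__main__":
    # Hypothetical box on a hypothetical 400x200 image.
    print(canonical_to_yolo((100, 50, 300, 150), 400, 200))
    print(canonical_to_coco((100, 50, 300, 150)))
```

Keeping pixel corners as the single source of truth means both exports are pure functions of the canonical record, which is what makes visual QA on the canonical JSON meaningful for every downstream format.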
Split discipline
Split Distribution
Treat the published partitions as part of the benchmark definition so training, threshold tuning, and final reporting stay separated.
Train
70%
Primary fitting split for augmentation, model learning, and batch-level experimentation.
Validation
15%
Used for threshold tuning, failure review, and regression checks during development.
Test
15%
Held out for final reporting, cross-model comparison, and publication-ready results.
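As a rough worked example, the nominal split sizes implied by the headline figures can be computed directly. This sketch assumes the 70/15/15 percentages apply exactly to the 12,400 published images; the actual per-split counts are not stated on this page and may differ slightly. The helper name is hypothetical.

```python
TOTAL_IMAGES = 12_400
TOTAL_ANNOTATIONS = 27_800
SPLIT_PERCENT = {"train": 70, "validation": 15, "test": 15}


def nominal_split_sizes(total: int, percents: dict) -> dict:
    """Return the image count per split implied by integer percentages."""
    if sum(percents.values()) != 100:
        raise ValueError("split percentages must sum to 100")
    # Integer arithmetic avoids float rounding; for totals that are not
    # multiples of 100 this floors, so a remainder would need assigning.
    return {name: total * pct // 100 for name, pct in percents.items()}


sizes = nominal_split_sizes(TOTAL_IMAGES, SPLIT_PERCENT)
mean_boxes_per_image = TOTAL_ANNOTATIONS / TOTAL_IMAGES

if __name__ == "__main__":
    print(sizes)
    print(round(mean_boxes_per_image, 2))
```

Under that assumption the splits come to 8,680 training, 1,860 validation, and 1,860 test images, with about 2.24 annotated boxes per image on average across the release.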
Technical profile
Image & Annotation Profile
These properties affect how the dataset is inspected, converted, and reused in downstream pipelines.
Primary image formats: JPG / PNG
Annotation type: bounding-box detection
Canonical JSON stores denormalized x1, y1, x2, y2 pixel coordinates
YOLO uploads require matching images or an image_manifest.json, because normalized YOLO coordinates can only be converted back to pixel boxes when the image dimensions are known
Available export targets include Canonical JSON, YOLO (v5 through v10), and COCO bundles
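The dimension-recovery requirement for YOLO uploads can be illustrated with a short sketch. The schema of image_manifest.json is not documented on this page, so the layout used here (file name mapped to `width` and `height`) is an assumption, as are the function names and the sample file name.

```python
import json
from typing import Tuple


def load_dimensions(manifest_text: str, image_name: str) -> Tuple[int, int]:
    """Look up (width, height) for one image in an assumed manifest layout."""
    manifest = json.loads(manifest_text)
    entry = manifest[image_name]  # assumed shape: {"width": ..., "height": ...}
    return entry["width"], entry["height"]


def yolo_line_to_canonical(line: str, img_w: int, img_h: int):
    """Parse 'class cx cy w h' (normalized) into (class_id, (x1, y1, x2, y2))."""
    parts = line.split()
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:5])
    x1 = (cx - w / 2.0) * img_w
    y1 = (cy - h / 2.0) * img_h
    x2 = (cx + w / 2.0) * img_w
    y2 = (cy + h / 2.0) * img_h
    return class_id, (x1, y1, x2, y2)


if __name__ == "__main__":
    # Hypothetical manifest and label line, for illustration only.
    manifest_text = '{"plate_001.jpg": {"width": 400, "height": 200}}'
    width, height = load_dimensions(manifest_text, "plate_001.jpg")
    print(yolo_line_to_canonical("0 0.5 0.5 0.5 0.5", width, height))
```

Without the width and height, the same label line is compatible with infinitely many pixel boxes, which is why either the images or a manifest must accompany a YOLO upload.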
Label structure
Class Composition
The class map is intentionally compact so detection and counting experiments stay interpretable while difficult non-target context remains visible.
Each cyst annotation marks one countable SCN cyst instance and should be treated as the core measurement signal in detection and counting workflows.
Debris captures visually confusing non-target material that can inflate false positives if the class boundary is not modeled explicitly.
Root annotations preserve biological context so models can separate cyst targets from surrounding plant material instead of learning a flattened foreground/background view.
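In a counting workflow, the three classes described above might be tallied per image as in the sketch below. The record layout (a `label` key holding the class name) and the lowercase class names are assumptions for illustration; the canonical JSON schema is not specified on this page.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

CLASSES = ("cyst", "debris", "root")  # assumed spelling of the class names


def count_by_class(annotations: Iterable[Mapping]) -> Dict[str, int]:
    """Count annotations per class, reporting zero for absent classes."""
    counts = Counter(a["label"] for a in annotations)
    return {name: counts.get(name, 0) for name in CLASSES}


if __name__ == "__main__":
    # Hypothetical annotations for a single image.
    image_annotations = [
        {"label": "cyst", "box": [10, 10, 30, 30]},
        {"label": "cyst", "box": [50, 40, 72, 61]},
        {"label": "debris", "box": [80, 15, 95, 28]},
    ]
    per_class = count_by_class(image_annotations)
    print(per_class)            # the cyst entry is the per-image count
    print(per_class["cyst"])
```

Reporting debris and root alongside the cyst count keeps the non-target context visible, so a drop in precision can be traced to a specific confusing class rather than to an undifferentiated background.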
Class Reference
| Class | Role | Why it matters |
| --- | --- | --- |
| Cyst | Primary target | Each Cyst row in the canonical annotations corresponds to one counted SCN cyst instance. |
| Debris | Confusing background | Helps document hard negatives and visually similar non-target material that can reduce precision. |
| Root | Scene context | Keeps plant structure visible in the label vocabulary so the dataset remains useful for robust detection analysis. |
Release Checklist
- Keep split reporting fixed so results remain comparable across YOLO- and COCO-based training runs.
- Use canonical JSON as the inspection layer when validating coordinate fidelity or parser behavior.
- Update only the metrics and release wording here when the archived publication bundle is finalized.
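One way to act on the coordinate-fidelity item in the checklist is a pair of small checks on canonical boxes: a sanity check against the image bounds, and a round trip through normalized YOLO coordinates. This is a sketch under assumptions, not the project's validator; the function names and the problem messages are hypothetical.

```python
from typing import List, Sequence


def box_problems(box: Sequence[float], img_w: float, img_h: float) -> List[str]:
    """Return a list of problems found in a pixel (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    problems = []
    if not (x1 < x2 and y1 < y2):
        problems.append("degenerate or inverted box")
    if x1 < 0 or y1 < 0 or x2 > img_w or y2 > img_h:
        problems.append("box extends outside image")
    return problems


def round_trip_error(box: Sequence[float], img_w: float, img_h: float) -> float:
    """Max absolute pixel drift after canonical -> YOLO -> canonical."""
    x1, y1, x2, y2 = box
    # Forward: pixel corners to normalized center/size.
    cx, cy = (x1 + x2) / 2.0 / img_w, (y1 + y2) / 2.0 / img_h
    w, h = (x2 - x1) / img_w, (y2 - y1) / img_h
    # Backward: normalized center/size to pixel corners.
    restored = (
        (cx - w / 2.0) * img_w,
        (cy - h / 2.0) * img_h,
        (cx + w / 2.0) * img_w,
        (cy + h / 2.0) * img_h,
    )
    return max(abs(a - b) for a, b in zip(box, restored))


if __name__ == "__main__":
    # Hypothetical boxes on a hypothetical 400x200 image.
    print(box_problems((100, 50, 300, 150), 400, 200))
    print(box_problems((300, 50, 100, 150), 400, 200))
    print(round_trip_error((10.5, 20.25, 33.0, 47.75), 640, 480))
```

A round-trip drift well below one pixel suggests the parser and exporter agree on the coordinate convention; a larger drift usually points to swapped axes, a corner/center mix-up, or wrong image dimensions.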
Interpretation
How to use this page
Treat this page as the public-facing summary of the release. The layout now separates benchmark statistics, technical profile, and class interpretation so final numbers can be updated in JSON without reworking the page template.