Evidence
Evidence by Category
The findings are grouped by workflow stage. Source labels are kept explicit so every assumption is traceable, and research evidence is kept distinct from case evidence so validated results are not conflated with scenario assumptions.
Duplicate Detection
Source-backed: Catching duplicates early directly reduces triage-queue churn and duplicated investigator time.
- Mozilla: ~30% duplicate rate
- Eclipse: ~20% duplicate rate
- General range: 20-30% duplicates
- 2.8 hours per duplicate caught
[Figure: duplicate rate comparison]
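The duplicate-rate and per-duplicate-cost figures above support a back-of-envelope savings estimate. In the sketch below, the report volume (1,000/month) is a hypothetical assumption; only the ~30% rate and the 2.8 hours/duplicate come from the sources.

```python
# Back-of-envelope triage savings from catching duplicates early.
# The monthly report volume is a hypothetical assumption; the ~30%
# (Mozilla-like) rate and 2.8 hours/duplicate are the sourced figures.
def duplicate_triage_savings(reports_per_month: int,
                             duplicate_rate: float,
                             hours_per_duplicate: float) -> float:
    """Estimated investigator-hours saved per month."""
    return reports_per_month * duplicate_rate * hours_per_duplicate

saved = duplicate_triage_savings(1_000, 0.30, 2.8)  # ~840 hours/month
```

At the Eclipse-like ~20% rate the same volume still yields roughly 560 hours/month, so the estimate is material across the sourced range.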
Validity Filtering
Source-backed: Filtering invalid and non-actionable items improves triage signal quality.
- Up to 70% invalid reports in selected literature
- 36% invalid + duplicate combined in the sample window
- 5.14 comments on average per WontFix issue
[Figure: validity distribution]
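The filtering figures above can be combined into a rough queue-reduction estimate. In this sketch the queue size (500) is a hypothetical assumption; the 36% combined invalid+duplicate rate and the 5.14 comments per WontFix issue are the sourced numbers.

```python
# Effect of filtering invalid/duplicate items before triage.
# Queue size is a hypothetical assumption; the 36% combined rate and
# 5.14 comments/WontFix issue come from the sourced figures above.
def filtering_effect(queue_size, invalid_rate, comments_per_wontfix):
    filtered = queue_size * invalid_rate           # items removed up front
    remaining = queue_size - filtered              # actionable queue
    comments_avoided = filtered * comments_per_wontfix
    return remaining, comments_avoided

remaining, comments_avoided = filtering_effect(500, 0.36, 5.14)
# ~320 actionable reports; ~925 discussion comments avoided
```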
Automated Reproduction
Source-backed: Higher-quality steps-to-reproduce (S2R) and behavior context enable automation-driven reproduction workflows.
- ReBL: 90.63% success, 74.98 s average
- AdbGPT: 81.3% success, 253.6 s average
- BugCraft: strong reduction vs. a 3.41-day manual baseline
- BugScribe: supports better reproducibility capture
Comparison table
| Tool / Study | Success | Avg. time | Evidence note |
|---|---|---|---|
| ReBL | 90.63% | 74.98s | Published benchmark. |
| AdbGPT | 81.3% | 253.6s | Prompt-driven Android replay benchmark. |
| BugCraft | N/A | minutes (from baseline 3.41 days) | LLM-agent reproduction in Minecraft setting. |
| EBug | N/A | N/A | Research focuses on guided report quality + construction speed. |
| BugScribe | N/A | N/A | Project materials: automated structure and writing-time reduction context. |
[Figure: reproduction comparison]
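For scale, the table's timings can be expressed as speedups over the manual baseline. One caveat makes this illustrative only: the 3.41-day baseline comes from the BugCraft (Minecraft) setting, while ReBL and AdbGPT are Android benchmarks, so the ratios compare across different environments.

```python
# Illustrative speedups vs. the 3.41-day manual baseline from the table.
# Caveat: the baseline is from the BugCraft (Minecraft) setting, while
# ReBL/AdbGPT are Android benchmarks, so these ratios are indicative only.
MANUAL_BASELINE_SECONDS = 3.41 * 24 * 3600  # ~294,624 s

def speedup(tool_seconds: float) -> float:
    return MANUAL_BASELINE_SECONDS / tool_seconds

rebl_speedup = speedup(74.98)    # ReBL, roughly 3,900x
adbgpt_speedup = speedup(253.6)  # AdbGPT, roughly 1,160x
```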
Clarification Overhead
Source-backed: Less back-and-forth accelerates fix entry and reduces idle delays.
- ~2.7 clarification rounds per report
- 8-12 hours per round (case context)
- 18-22 hours/week clarification burden
- 35% sprint-time waste estimate
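The per-report delay implied by the case-context figures above is simple arithmetic: rounds multiplied by the per-round idle window. A minimal sketch, using only the sourced ~2.7 rounds/report and 8-12 hours/round:

```python
# Idle delay added per report by clarification rounds, using the
# case-context figures above (~2.7 rounds/report, 8-12 hours/round).
def clarification_delay_hours(rounds=2.7, hours_per_round=(8, 12)):
    low, high = hours_per_round
    return rounds * low, rounds * high

low_delay, high_delay = clarification_delay_hours()
# roughly 22-32 idle hours per report
```

That range is consistent with the 18-22 hours/week burden figure once multiple concurrent reports are in flight.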