About BugScribe
BugScribe addresses the challenges of incomplete, invalid, and inconsistent bug reports by guiding users to produce high-quality, actionable reports with minimal effort. It combines automated capture of user interactions, intelligent analysis, and interactive guidance to ensure essential details are included and ambiguities are clarified.
By evaluating the quality of each report and suggesting improvements before submission, BugScribe reduces the likelihood of invalid or duplicate reports. It also integrates with popular bug-tracking platforms, so developers receive structured, reliable, and reproducible bug reports.
Key Features
- AI-Assisted Bug Description Generation
- Automated Steps-to-Reproduce (S2R)
- Session Replay & Video Recording
- Conversational AI Agent
- Pre-Submission Invalid Bug Detection
- Duplicate Issue Finding
- Knowledge Base Integration
- Multi-Platform Bug Tracker Integration
Speed Up Bug Reporting
Faster reports, less effort, better quality. One short description from you becomes a full, submission-ready bug report in under two minutes.
Time & effort
- 20–40× faster than writing by hand (~20 min → under 2 min)
- One description → full structured report (no filling 5+ fields manually)
- Describe → review → submit; optional tweaks in natural language
What’s auto-filled
- Summary, steps to reproduce, expected vs observed, environment, extra context
- Validity check in seconds: valid/invalid, confidence %, explanation, route to engineering or suggested fixes
- Duplicate/similar issues surfaced before filing so you can link or skip
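As a rough illustration of the items above, the sketch below shows one possible shape for an auto-generated report, with the validity check and duplicate search attached. The field names and the `is_submittable` helper are illustrative assumptions, not BugScribe's actual schema or API.

```python
# Hypothetical payload for one auto-generated report (field names
# are illustrative, not BugScribe's real schema).
bug_report = {
    "summary": "App crashes when opening the seat map at checkout",
    "steps_to_reproduce": [
        "Open the booking flow and select a flight",
        "Proceed to checkout",
        "Tap 'Choose seat' to open the seat map",
    ],
    "expected": "Seat map renders and a seat can be selected",
    "observed": "App crashes with a blank screen",
    "environment": {"os": "iOS 17.4", "app_version": "5.2.1"},
    # Validity check: label, confidence, and an explanation string.
    "validity": {"label": "valid_bug", "confidence": 0.88, "explanation": "..."},
    # Similar issues surfaced before filing, so the user can link or skip.
    "possible_duplicates": [],
}

def is_submittable(report, threshold=0.5):
    """Illustrative gate: submit only if classified valid with enough
    confidence and no open duplicates were found."""
    v = report["validity"]
    return (
        v["label"] == "valid_bug"
        and v["confidence"] >= threshold
        and not report["possible_duplicates"]
    )

print(is_submittable(bug_report))  # True for this example
```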
Quality & consistency
- Structured format every time (same fields and layout) for easier triage and search
- Validity + confidence (e.g. 88%) so triage can prioritize without re-reading
- Fix suggestions and next steps generated with the report
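Because every report carries the same validity label and confidence score, a triage queue can be ranked mechanically. The snippet below is a minimal sketch of that idea (the report IDs and field names are made up for the example):

```python
# Hypothetical triage queue: each report carries a uniform validity
# label and confidence, so triage can rank without re-reading.
reports = [
    {"id": "BS-101", "validity": "valid_bug", "confidence": 0.88},
    {"id": "BS-102", "validity": "invalid_bug", "confidence": 0.91},
    {"id": "BS-103", "validity": "valid_bug", "confidence": 0.62},
]

# Likely-valid bugs first, highest confidence first; likely-invalid sink.
queue = sorted(
    reports,
    key=lambda r: (r["validity"] != "valid_bug", -r["confidence"]),
)
print([r["id"] for r in queue])  # ['BS-101', 'BS-103', 'BS-102']
```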
User journey
- 3 steps: describe the issue → review the draft → submit
- Single place for description, logs, optional screenshot/replay → one report, one link to share
Research & Validation
BugScribe’s Pre-Submission Invalid Bug Detection is backed by systematic LLM experiments. We evaluated a validity classifier on Turkish Airlines software bug reports to compare different prompt and inference strategies. The table below summarizes the experiments; the valid_bug class is treated as positive, so TP, FN, TN, and FP denote true positives, false negatives, true negatives, and false positives.
Summary of Experiments
| Experiment | Accuracy | Correct/Total | TP | FN | TN | FP | Calls per report |
|---|---|---|---|---|---|---|---|
| Basic | 64.0% | 32/50 | 29 | 0 | 3 | 18 | 1 |
| Enhanced | 78.0% | 39/50 | 22 | 7 | 17 | 4 | 1 |
| Enhanced v2 | 82.0% | 41/50 | 23 | 6 | 18 | 3 | 3 (majority vote) |
| Full 50 (iteration 1) | 82.0% | 41/50 | 26 | 3 | 15 | 6 | 1 |
| Full 50 (iteration 2) | 90.0% | 45/50 | 29 | 0 | 16 | 5 | 1 |
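The accuracy column follows directly from the confusion counts as accuracy = (TP + TN) / (TP + FN + TN + FP). This can be checked against the table:

```python
# Recompute each experiment's accuracy from its confusion counts.
def accuracy(tp, fn, tn, fp):
    return (tp + tn) / (tp + fn + tn + fp)

# (TP, FN, TN, FP) per experiment, copied from the table above.
experiments = {
    "Basic": (29, 0, 3, 18),
    "Enhanced": (22, 7, 17, 4),
    "Enhanced v2": (23, 6, 18, 3),
    "Full 50 (iteration 1)": (26, 3, 15, 6),
    "Full 50 (iteration 2)": (29, 0, 16, 5),
}
for name, counts in experiments.items():
    print(f"{name}: {accuracy(*counts):.0%}")
# Basic: 64% ... Full 50 (iteration 2): 90%, matching the table.
```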
Experiment Overview
- Basic (64%): Minimal prompt (definitions only). Baseline; strong bias toward valid_bug.
- Enhanced (78%): Turkish Airlines context + classification criteria; better invalid_bug detection.
- Enhanced v2 (82%): Few-shot examples, rules checklist, JSON output; 3 calls per report with majority vote.
- Full 50 iteration 1 (82%): No few-shot examples and no leakage of evaluation data into the prompt; only context-derived rules and principles.
- Full 50 iteration 2 (90%): Same as iteration 1 plus refined guidance (nuanced distinctions from context). Best accuracy with a single call per report.
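The Enhanced v2 majority vote can be sketched as follows; `classify_once` stands in for a single LLM call returning `valid_bug` or `invalid_bug` (the actual call and prompt are not shown here, and the stub below is purely for illustration):

```python
from collections import Counter

def majority_vote(report, classify_once, n_calls=3):
    """Classify the same report n_calls times and return the most
    common label plus the fraction of votes that agreed with it."""
    votes = Counter(classify_once(report) for _ in range(n_calls))
    label, count = votes.most_common(1)[0]
    return label, count / n_calls

# Example with a stubbed classifier that disagrees on one of three calls.
answers = iter(["valid_bug", "invalid_bug", "valid_bug"])
label, agreement = majority_vote("report text", lambda r: next(answers))
print(label)  # valid_bug (2 of 3 votes)
```

Iterations 1 and 2 drop the vote and use a single call per report, which is what makes the 90% configuration cheap to run.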
Conclusion
Adding domain context, classification criteria, and refined guidance (derived from the same context document) improved accuracy from 64% to 90%. The best configuration (Full 50 iteration 2) achieves 90% accuracy with one LLM call per report and no use of evaluation data in the prompt, demonstrating a practical approach for pre-submission invalid bug detection in BugScribe.