Using InsightFace + ByteTrack with a Tapo Camera
Overview
This research explores a low-cost automated staff check-in system using computer vision.
The System Combines
- Face detection & recognition: InsightFace
- Multi-object tracking: ByteTrack
- Camera hardware: TP‑Link Tapo Camera
The camera is installed far from the entrance door, simulating a realistic office setup where employees walk through naturally without interacting with a device.
The goal is to enable fully automatic attendance logging without RFID cards, fingerprint scanners, or manual interaction.
System Architecture
Camera Setup
- A Tapo IP camera is mounted at a distance from the entrance.
- The video stream is processed continuously.
- Staff walk naturally through the door.
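As a rough sketch of the stream input, the snippet below builds an RTSP URL and reads frames with OpenCV. The URL layout (port 554, path `stream1`), the credentials, and the helper names `build_rtsp_url` and `read_frames` are assumptions for illustration, not details from this setup; check the camera's own settings before relying on them.

```python
from urllib.parse import quote


def build_rtsp_url(user, password, host, port=554, path="stream1"):
    """Build an RTSP URL, percent-encoding the credentials (assumed URL layout)."""
    return (
        f"rtsp://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{path}"
    )


def read_frames(url, max_frames=None):
    """Yield frames from the stream until it ends or max_frames is reached."""
    import cv2  # imported lazily so the URL helper works without OpenCV

    cap = cv2.VideoCapture(url)
    try:
        count = 0
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break  # stream dropped or ended
            yield frame
            count += 1
            if max_frames is not None and count >= max_frames:
                break
    finally:
        cap.release()
```

A caller would loop with something like `for frame in read_frames(build_rtsp_url("admin", "secret", "192.168.1.50")):`, where the address and credentials are placeholders. A production loop would likely also reconnect when the stream drops, which this sketch does not do.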
Processing Pipeline
- Video Stream Input
- Face Detection
- Face Tracking (ByteTrack)
- Face Recognition (InsightFace embeddings)
- Identity Matching
- Attendance Logging
ByteTrack assigns a persistent track ID to each person across frames, so the system can follow the same person through the scene, avoid repeated recognition calls, and produce more stable results.
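The idea of recognizing each track only once can be sketched as a small cache keyed by track ID. This is a minimal illustration, not the system's actual code: `TrackIdentityCache` and `fake_recognize` are hypothetical names, and the track IDs are assumed to come from the tracker.

```python
class TrackIdentityCache:
    """Remember the identity assigned to each track ID (hypothetical helper)."""

    def __init__(self, recognize):
        self._recognize = recognize  # callable: face crop -> identity label
        self._identities = {}        # track_id -> identity label
        self.calls = 0               # number of recognition calls made

    def identify(self, track_id, face_crop):
        # Run recognition only the first time a track ID is seen.
        if track_id not in self._identities:
            self._identities[track_id] = self._recognize(face_crop)
            self.calls += 1
        return self._identities[track_id]

    def drop(self, track_id):
        # Forget a track once the tracker reports it as finished.
        self._identities.pop(track_id, None)


# Stand-in recognizer: a real system would compute and match an embedding.
def fake_recognize(face_crop):
    return face_crop["label"]


cache = TrackIdentityCache(fake_recognize)
# Three frames: track 1 appears in all of them, track 2 in the last two.
frames = [
    [(1, {"label": "Staff-A"})],
    [(1, {"label": "Staff-A"}), (2, {"label": "Unknown"})],
    [(1, {"label": "Staff-A"}), (2, {"label": "Unknown"})],
]
results = [[cache.identify(tid, crop) for tid, crop in frame] for frame in frames]
print(cache.calls)  # 5 tracked faces in total, but only 2 recognition calls
```

One possible refinement is to retry recognition while a track is still labeled unknown, since the first frame of a track may show the face at a poor angle.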
Evaluation Dataset
Test Dataset Characteristics
- Total samples: 72 face instances
- Multiple staff identities
- Unknown faces included to test rejection ability
- Captured from real camera footage rather than controlled photos
Overall Performance
| Metric | Score |
|---|---|
| Accuracy | 93.06% |
| Weighted F1 | 0.9297 |
| Macro F1 | 0.9313 |
| Weighted Precision | 0.9444 |
Other Stats
- No missing face detections
- No corrupted images
Per-Class Results (Names Hidden)
| Staff ID | Support | Precision | Recall | F1 |
|---|---|---|---|---|
| Staff-A | 5 | 1.00 | 1.00 | 1.00 |
| Staff-B | 15 | 1.00 | 1.00 | 1.00 |
| Staff-C | 9 | 1.00 | 0.89 | 0.94 |
| Staff-D | 10 | 1.00 | 0.80 | 0.89 |
| Staff-E | 7 | 1.00 | 1.00 | 1.00 |
| Staff-F | 6 | 1.00 | 0.67 | 0.80 |
| Unknown | 20 | 0.80 | 1.00 | 0.89 |
Error Analysis
Top Misclassifications
| Expected | Predicted | Count |
|---|---|---|
| Staff-D | Unknown | 2 |
| Staff-F | Unknown | 2 |
| Staff-C | Unknown | 1 |
Observations
- All five errors are staff members classified as unknown; none is a staff member mistaken for another staff member.
- This is the safer failure mode: a missed check-in does not log attendance for the wrong person.
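The reported scores can be cross-checked from the per-class support and the misclassification counts. The sketch below assumes the misclassification table lists every error, which is consistent with the accuracy figure (67 of 72 correct).

```python
# Per-class support from the results table.
support = {"Staff-A": 5, "Staff-B": 15, "Staff-C": 9, "Staff-D": 10,
           "Staff-E": 7, "Staff-F": 6, "Unknown": 20}
# (expected, predicted) -> count, from the misclassification table.
errors = {("Staff-D", "Unknown"): 2, ("Staff-F", "Unknown"): 2,
          ("Staff-C", "Unknown"): 1}

total = sum(support.values())
metrics = {}
for cls, n in support.items():
    fn = sum(c for (exp, _), c in errors.items() if exp == cls)   # missed
    fp = sum(c for (_, pred), c in errors.items() if pred == cls)  # wrongly assigned
    tp = n - fn
    precision = tp / (tp + fp)
    recall = tp / n
    f1 = 2 * precision * recall / (precision + recall)
    metrics[cls] = (precision, recall, f1)

accuracy = (total - sum(errors.values())) / total
macro_f1 = sum(f1 for _, _, f1 in metrics.values()) / len(metrics)
weighted_f1 = sum(support[c] * metrics[c][2] for c in support) / total
weighted_precision = sum(support[c] * metrics[c][0] for c in support) / total

print(f"accuracy={accuracy:.4f} weighted_f1={weighted_f1:.4f} "
      f"macro_f1={macro_f1:.4f} weighted_precision={weighted_precision:.4f}")
# accuracy=0.9306 weighted_f1=0.9297 macro_f1=0.9313 weighted_precision=0.9444
```

The unknown class has a precision of 20 / 25 = 0.80 because the five rejected staff faces all land in it, while every staff class keeps a precision of 1.00.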
Key Insights
1. Tracking Improves Recognition Stability
Using ByteTrack allows the system to:
- Recognize a person once per track rather than on every frame
- Reduce computation cost
2. Long-Distance Camera Still Works
Despite the camera being far from the door:
- Detection remained reliable
- Recognition reached 93.06% accuracy on the 72-sample test set
3. Unknown Detection Works Well
The system successfully rejects unfamiliar faces:
- Recall for unknown class = 100%
- Precision for the unknown class is 0.80 because five known staff faces were rejected as unknown; threshold tuning may improve this.
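Threshold-based rejection can be illustrated with cosine similarity against a gallery of reference embeddings. This is a sketch under assumptions: the function names, the toy 3-D vectors, and the threshold of 0.4 are placeholders, not values from this system.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def match_identity(embedding, gallery, threshold=0.4):
    """Return (label, score); label is "Unknown" if the best score is below threshold."""
    best_label, best_score = "Unknown", -1.0
    for label, reference in gallery.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return "Unknown", best_score
    return best_label, best_score


# Toy 3-D vectors; real face embeddings have hundreds of dimensions.
gallery = {"Staff-A": [1.0, 0.0, 0.0], "Staff-B": [0.0, 1.0, 0.0]}
print(match_identity([0.9, 0.1, 0.0], gallery)[0])  # close to Staff-A
print(match_identity([0.0, 0.0, 1.0], gallery)[0])  # orthogonal to both: Unknown
```

Raising the threshold rejects more genuine staff faces, while lowering it risks accepting strangers. Tuning it on held-out footage from the same camera is one way to balance the two.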
Practical Implications
System Benefits
This approach enables a fully automated attendance system with:
- No hardware interaction
- No fingerprint scanners
- No RFID cards
- Minimal installation cost
Only Requirements
- A single IP camera
- A small compute server
Future Improvements
Potential Next Steps
- Add multi-camera fusion
- Improve quality filtering for faces captured at a distance
- Use temporal embedding averaging
- Integrate with HR / payroll systems
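Temporal embedding averaging could look roughly like the sketch below: the embeddings collected along one track are L2-normalized, averaged, and renormalized before matching. The helper names are hypothetical, and the inputs are assumed to be non-zero vectors of equal length.

```python
import math


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def average_embedding(embeddings):
    """Average L2-normalized embeddings from one track, then renormalize."""
    normalized = [l2_normalize(e) for e in embeddings]
    dim = len(normalized[0])
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


# Three noisy toy observations of the same face along one track.
track_embeddings = [[1.0, 0.2], [1.0, -0.2], [2.0, 0.0]]
avg = average_embedding(track_embeddings)
print(avg)  # the opposite-signed noise in the second component cancels out
```

Normalizing before averaging keeps a single high-magnitude frame from dominating the result, so each observation contributes only its direction.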