feat: add complete API specification

- 10 prioritized user stories (P1-P3)
- 30 functional requirements
- 15 success criteria with measurable outcomes
- Complete edge cases and risk analysis
- Technology-agnostic specification ready for planning phase
Geutebruck API Developer
2025-11-13 03:16:03 -08:00
parent 25c29b49ba
commit 44dc06e7f1


@@ -0,0 +1,360 @@
# Feature Specification: Geutebruck Video Surveillance API
**Feature Branch**: `001-surveillance-api`
**Created**: 2025-11-13
**Status**: Draft
**Input**: User description: "Complete RESTful API for Geutebruck GeViScope/GeViSoft video surveillance system control"
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Secure API Access (Priority: P1)
As a developer integrating a custom surveillance application, I need to authenticate to the API securely so that only authorized users can access camera feeds and control functions.
**Why this priority**: Without authentication, the entire system is insecure and unusable. This is the foundation for all other features and must be implemented first.
**Independent Test**: Can be fully tested by attempting to access protected endpoints without credentials (should fail), then with valid JWT tokens (should succeed), and delivers a working authentication system that all other features depend on.
**Acceptance Scenarios**:
1. **Given** a developer with valid credentials, **When** they request a JWT token from `/api/v1/auth/login`, **Then** they receive a token valid for 1 hour with appropriate user claims
2. **Given** an expired JWT token, **When** a client uses it to access a protected endpoint, **Then** the client receives a 401 Unauthorized response with a clear error message
3. **Given** a valid refresh token, **When** a client uses it to request a new access token, **Then** the client receives a fresh JWT token without re-authenticating
4. **Given** invalid credentials, **When** a client attempts to log in, **Then** the client receives a 401 response and the failed attempt is logged for security monitoring
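
The token lifecycle in the scenarios above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not a design decision: the HS256 algorithm, the claim names, and the helper names `issue_token` and `verify_token` are hypothetical, and a real implementation would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

ACCESS_TOKEN_TTL_SECONDS = 3600  # 1 hour, per acceptance scenario 1


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(username: str, role: str, secret: bytes,
                now: Optional[float] = None) -> str:
    """Build an HS256-signed token with user claims and a 1-hour expiry."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": username, "role": role, "iat": issued_at,
              "exp": issued_at + ACCESS_TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Tuple[int, Optional[dict]]:
    """Return (200, claims) for a valid token, or (401, None) otherwise."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        expected = hmac.new(secret, f"{header_b64}.{claims_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return 401, None  # signature mismatch
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:
        return 401, None  # malformed token
    current = time.time() if now is None else now
    if current >= claims.get("exp", 0):
        return 401, None  # expired token
    return 200, claims
```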
---
### User Story 2 - Live Video Stream Access (Priority: P1)
As a security operator, I need to view live video streams from surveillance cameras through the API so that I can monitor locations in real-time from a custom dashboard.
**Why this priority**: Live video viewing is the core function of surveillance systems. Without this, the system cannot fulfill its primary purpose.
**Independent Test**: Can be fully tested by requesting stream URLs for configured cameras and verifying that video playback works, delivering immediate value as a basic surveillance viewer.
**Acceptance Scenarios**:
1. **Given** an authenticated user with camera view permissions, **When** they request a live stream for camera channel 5, **Then** they receive a stream URL or WebSocket connection that delivers live video within 2 seconds
2. **Given** a camera that is offline, **When** a user requests its stream, **Then** they receive a clear error message indicating the camera is unavailable
3. **Given** multiple concurrent users, **When** they request the same camera stream, **Then** all users can view the stream simultaneously without degradation (up to 100 concurrent streams)
4. **Given** a user without permission for a specific camera, **When** they request its stream, **Then** they receive a 403 Forbidden response
---
### User Story 3 - Camera PTZ Control (Priority: P1)
As a security operator, I need to control pan-tilt-zoom cameras remotely via the API so that I can adjust camera angles to investigate incidents or track movement.
**Why this priority**: PTZ control is essential for active surveillance operations and incident response, making it critical for operational use.
**Independent Test**: Can be fully tested by sending PTZ commands (pan left/right, tilt up/down, zoom in/out) to a PTZ-capable camera and verifying movement occurs, delivering functional camera control capabilities.
**Acceptance Scenarios**:
1. **Given** an authenticated operator with PTZ permissions, **When** they send a pan-left command to camera 3, **Then** the camera begins moving left within 500ms and they receive confirmation
2. **Given** a camera that doesn't support PTZ, **When** a user attempts PTZ control, **Then** they receive a clear error indicating PTZ is not available for this camera
3. **Given** two operators controlling the same PTZ camera, **When** they send conflicting commands simultaneously, **Then** the system queues commands and notifies operators of the conflict
4. **Given** a PTZ command in progress, **When** the user sends a stop command, **Then** the camera movement stops immediately
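
One possible reading of scenarios 3 and 4 is a per-camera command queue that records a conflict notice when operators overlap and lets a stop command pre-empt anything still waiting. The sketch below assumes that reading; `PtzCommand`, `PtzArbiter`, and the `"ptz-conflict"` notice are hypothetical names, and the actual arbitration policy is left to the planning phase.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PtzCommand:
    operator: str
    action: str  # e.g. "pan-left", "tilt-up", "zoom-in", "stop"


@dataclass
class PtzArbiter:
    """Queues PTZ commands for one camera and records conflict notices."""
    pending: deque = field(default_factory=deque)
    notices: list = field(default_factory=list)

    def submit(self, command: PtzCommand) -> None:
        if command.action == "stop":
            # Assumption: a stop pre-empts every command still queued.
            self.pending.clear()
            self.pending.append(command)
            return
        others = {c.operator for c in self.pending
                  if c.operator != command.operator}
        if others:
            # Notify every operator involved in the overlap.
            for operator in sorted(others | {command.operator}):
                self.notices.append((operator, "ptz-conflict"))
        self.pending.append(command)

    def next_command(self):
        """Return the next command to forward to the camera, or None."""
        return self.pending.popleft() if self.pending else None
```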
---
### User Story 4 - Real-time Event Notifications (Priority: P1)
As a security operator, I need to receive instant notifications when surveillance events occur (motion detection, alarms, sensor triggers) so that I can respond quickly to security incidents.
**Why this priority**: Real-time alerts are critical for security effectiveness. Without event notifications, operators must constantly monitor all cameras manually.
**Independent Test**: Can be fully tested by subscribing to event notifications via WebSocket, triggering a test alarm, and verifying notification delivery within 100ms, providing functional event monitoring.
**Acceptance Scenarios**:
1. **Given** an authenticated user with event subscription permissions, **When** they connect to the WebSocket endpoint `/api/v1/events/stream`, **Then** they receive a connection confirmation and can subscribe to specific event types
2. **Given** a motion detection event occurs on camera 7, **When** a subscribed user is listening for video analytics events, **Then** they receive a notification within 100ms containing event type, camera channel, timestamp, and relevant data
3. **Given** a network disconnection, **When** the WebSocket reconnects, **Then** the user automatically re-subscribes to their previous event types and receives any missed critical events
4. **Given** 1000+ concurrent WebSocket connections, **When** an event occurs, **Then** all subscribed users receive notifications without system degradation
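
Subscribing "to specific event types", optionally narrowed to camera channels, amounts to a filter applied before fan-out. A minimal sketch of that filter follows; the `Event` and `Subscription` shapes and the type names are assumptions for illustration, and the WebSocket transport is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Event:
    event_type: str   # e.g. "alarm", "analytics", "system"
    channel: int      # camera channel that raised the event
    timestamp: str    # ISO 8601 string
    data: dict = field(default_factory=dict)


@dataclass
class Subscription:
    event_types: Set[str]
    channels: Optional[Set[int]] = None  # None means "all channels"

    def matches(self, event: Event) -> bool:
        """True when the event passes both the type and channel filters."""
        if event.event_type not in self.event_types:
            return False
        return self.channels is None or event.channel in self.channels


def recipients(event: Event, subscriptions: Dict[str, Subscription]) -> List[str]:
    """Return the client IDs whose subscription matches the event."""
    return [client for client, sub in subscriptions.items()
            if sub.matches(event)]
```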
---
### User Story 5 - Recording Management (Priority: P2)
As a security administrator, I need to manage video recording settings and query recorded footage so that I can configure retention policies and retrieve historical video for investigations.
**Why this priority**: Important for compliance and investigations but not required for basic live monitoring. Can be added after core live viewing is functional.
**Independent Test**: Can be fully tested by configuring recording schedules, starting/stopping recording on specific cameras, and querying recorded footage by time range, delivering complete recording management.
**Acceptance Scenarios**:
1. **Given** an authenticated administrator, **When** they request recording start on camera 2, **Then** the camera begins recording and they receive confirmation with recording ID
2. **Given** a time range query for 2025-11-12 14:00 to 16:00 on camera 5, **When** an investigator searches for recordings, **Then** they receive a list of available recording segments with download URLs
3. **Given** the ring buffer is at 90% capacity, **When** an administrator checks recording capacity, **Then** they receive an alert indicating low storage and identifying the oldest recordings that will be overwritten
4. **Given** scheduled recording configured for nighttime hours, **When** the schedule time arrives, **Then** recording automatically starts and stops according to the schedule
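
The capacity check in scenario 3 is simple arithmetic over the ring buffer's size and fill level. The sketch below assumes the buffer size, bytes used, and an average write rate are available from the SDK; `RingBufferStatus` and its field names are hypothetical, and "recording depth" is approximated here as the hours of footage a full buffer holds at that write rate.

```python
from dataclasses import dataclass


@dataclass
class RingBufferStatus:
    total_bytes: int
    used_bytes: int
    write_rate_bytes_per_hour: float  # assumed average recording rate

    def capacity_report(self, warn_percent: float = 90.0) -> dict:
        """Summarise capacity and flag low storage at the given fill level."""
        used_percent = self.used_bytes * 100 / self.total_bytes
        depth_hours = self.total_bytes / self.write_rate_bytes_per_hour
        return {
            "total_bytes": self.total_bytes,
            "free_bytes": self.total_bytes - self.used_bytes,
            "used_percent": round(used_percent, 1),
            "recording_depth_hours": round(depth_hours, 1),
            "low_storage": used_percent >= warn_percent,
        }
```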
---
### User Story 6 - Video Analytics Configuration (Priority: P2)
As a security administrator, I need to configure video content analysis features (motion detection, object tracking, perimeter protection) so that the system can automatically detect security-relevant events.
**Why this priority**: Enhances system capabilities but requires basic video viewing to already be working. Analytics configuration is valuable but not essential for day-one operation.
**Independent Test**: Can be fully tested by configuring motion detection zones on a camera, triggering motion, and verifying analytics events are generated, delivering automated detection capabilities.
**Acceptance Scenarios**:
1. **Given** an authenticated administrator, **When** they configure motion detection zones on camera 4, **Then** the configuration is saved and motion detection activates within those zones
2. **Given** motion detection configured with sensitivity level 7, **When** motion occurs in the detection zone, **Then** a motion detection event is generated and sent to event subscribers
3. **Given** object tracking enabled on camera 6, **When** a person enters the frame, **Then** the system assigns a tracking ID and sends position updates for the duration they remain visible
4. **Given** multiple analytics enabled on one camera (VMD + OBTRACK), **When** events occur, **Then** all configured analytics generate appropriate events without interfering with each other
---
### User Story 7 - Multi-Camera Management (Priority: P2)
As a security operator, I need to view and manage multiple cameras simultaneously via the API so that I can coordinate surveillance across different locations and camera views.
**Why this priority**: Enhances operational efficiency but single-camera operations must work first. Important for professional surveillance operations managing multiple sites.
**Independent Test**: Can be fully tested by retrieving a list of all available cameras, requesting multiple streams simultaneously, and grouping cameras by location, delivering multi-camera coordination.
**Acceptance Scenarios**:
1. **Given** an authenticated user, **When** they request the camera list from `/api/v1/cameras`, **Then** they receive all cameras they have permission to view with status, channel ID, capabilities, and location metadata
2. **Given** multiple cameras in the same location, **When** a user requests grouped camera data, **Then** cameras are organized by configured location/zone for easy navigation
3. **Given** a user viewing 16 camera streams, **When** they request streams via the API, **Then** all 16 streams initialize and display without individual stream degradation
4. **Given** a camera goes offline while being viewed, **When** the API detects the disconnection, **Then** the camera status updates and subscribers receive a notification
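
Scenarios 1 and 2 combine a permission filter with grouping by location. As a rough sketch, assuming each camera is a dictionary with `channel` and `location` keys (a hypothetical shape, not the final API schema):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def visible_cameras(cameras: Iterable[dict], allowed_channels: Set[int]) -> List[dict]:
    """Keep only the cameras the user has permission to view."""
    return [cam for cam in cameras if cam["channel"] in allowed_channels]


def group_by_location(cameras: Iterable[dict]) -> Dict[str, List[int]]:
    """Map each configured location to the channel IDs found there."""
    groups = defaultdict(list)
    for cam in cameras:
        groups[cam["location"]].append(cam["channel"])
    return dict(groups)
```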
---
### User Story 8 - License Plate Recognition Integration (Priority: P3)
As a security operator monitoring vehicle access, I need to receive automatic license plate recognition events so that I can track vehicle entry/exit and match against watchlists.
**Why this priority**: Valuable for specific use cases (parking, access control) but not universal. Only relevant if NPR hardware is available and configured.
**Independent Test**: Can be fully tested by configuring NPR zones, driving a test vehicle through the zone, and verifying plate recognition events with captured plate numbers, delivering automated vehicle tracking.
**Acceptance Scenarios**:
1. **Given** NPR configured on camera 9 with recognition zone defined, **When** a vehicle with readable plate enters the zone, **Then** an NPR event is generated containing plate number, country code, timestamp, confidence score, and image snapshot
2. **Given** a watchlist of plates configured, **When** a matching plate is recognized, **Then** a high-priority alert is sent to subscribers with match details
3. **Given** poor lighting or plate obstruction, **When** recognition fails or confidence is low (<70%), **Then** the event includes the best-guess plate and confidence level so operators can manually verify
4. **Given** continuous vehicle traffic, **When** multiple vehicles pass through rapidly, **Then** each vehicle generates a separate NPR event with unique tracking ID
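
The low-confidence and watchlist rules in scenarios 2 and 3 can be expressed as two flags on the event payload. The field names and the `build_npr_event` helper below are assumptions for illustration; only the 70% threshold comes from the scenario text.

```python
from typing import Collection

LOW_CONFIDENCE_THRESHOLD = 0.70  # from acceptance scenario 3


def build_npr_event(plate: str, country: str, confidence: float,
                    channel: int, watchlist: Collection[str] = ()) -> dict:
    """Build an NPR event, flagging weak reads and watchlist matches."""
    return {
        "type": "npr",
        "channel": channel,
        "plate": plate,  # best-guess plate, even when confidence is low
        "country": country,
        "confidence": confidence,
        "needs_manual_verification": confidence < LOW_CONFIDENCE_THRESHOLD,
        "priority": "high" if plate in watchlist else "normal",
    }
```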
---
### User Story 9 - Video Export and Backup (Priority: P3)
As a security investigator, I need to export specific video segments for evidence or sharing so that I can provide footage to law enforcement or use in incident reports.
**Why this priority**: Useful for investigations but not needed for live monitoring or basic recording. Can be added as an enhancement after core features are stable.
**Independent Test**: Can be fully tested by requesting export of a 10-minute segment from camera 3, receiving a download URL, and verifying the exported file plays correctly, delivering evidence export capability.
**Acceptance Scenarios**:
1. **Given** an authenticated investigator, **When** they request export of camera 8 footage from 10:00-10:15 on 2025-11-12, **Then** they receive an export job ID and can poll for completion status
2. **Given** an export job in progress, **When** the investigator checks job status, **Then** they receive progress percentage and estimated completion time
3. **Given** a completed export, **When** the investigator downloads the file, **Then** they receive a standard video format (MP4/AVI) playable in common media players with embedded timestamps
4. **Given** an export request for a time range with no recordings, **When** processing occurs, **Then** the user receives a clear message indicating no footage available for that timeframe
---
### User Story 10 - System Health Monitoring (Priority: P3)
As a system administrator, I need to monitor API and surveillance system health status so that I can proactively identify and resolve issues before they impact operations.
**Why this priority**: Important for production systems but not required for initial deployment. Health monitoring is an operational enhancement that can be added incrementally.
**Independent Test**: Can be fully tested by querying the health endpoint, checking SDK connectivity status, and verifying alerts when components fail, delivering system observability.
**Acceptance Scenarios**:
1. **Given** the API is running, **When** an unauthenticated user requests `/api/v1/health`, **Then** they receive system status including API uptime, SDK connectivity, database status, and overall health score
2. **Given** the GeViScope SDK connection fails, **When** health is checked, **Then** the health endpoint returns degraded status with specific SDK error details
3. **Given** disk space for recordings drops below 10%, **When** monitoring checks run, **Then** a warning is included in health status and administrators receive notification
4. **Given** an administrator monitoring performance, **When** they request detailed metrics, **Then** they receive statistics on request throughput, average response times, active WebSocket connections, and concurrent streams
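
Scenarios 2 and 3 describe how component checks roll up into one health payload. A minimal sketch, assuming the checks arrive as plain values; the payload keys and the `health_report` name are hypothetical, and the "overall health score" from scenario 1 is left out because the spec does not define how it is computed.

```python
from typing import Optional

DISK_WARNING_PERCENT = 10  # threshold from acceptance scenario 3


def health_report(sdk_connected: bool, database_ok: bool,
                  disk_free_percent: float,
                  sdk_error: Optional[str] = None) -> dict:
    """Combine component checks into a single health payload."""
    warnings = []
    if disk_free_percent < DISK_WARNING_PERCENT:
        warnings.append(
            f"recording disk space below {DISK_WARNING_PERCENT}%")
    return {
        # Any failed core component degrades the overall status.
        "status": "healthy" if sdk_connected and database_ok else "degraded",
        "components": {
            "sdk": {"ok": sdk_connected,
                    "error": None if sdk_connected else sdk_error},
            "database": {"ok": database_ok},
        },
        "warnings": warnings,
    }
```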
---
### Edge Cases
- What happens when a camera is physically disconnected while being actively viewed by 20 users?
- How does the system handle authentication when the GeViScope SDK is temporarily unavailable?
- What occurs when a user requests PTZ control on a camera that another user is already controlling?
- How does recording behave when the ring buffer reaches capacity during an active alarm event?
- What happens when network latency causes event notifications to queue up: does the system batch events or drop the oldest ones?
- How does the API respond when a user has permission for 50 cameras but only 30 are currently online?
- What occurs when a WebSocket connection drops mid-event notification?
- How does the system handle time zone differences between the API server, GeViScope SDK, and client applications?
- What happens when an export request spans a time range that crosses a recording gap (camera was off)?
- How does analytics configuration respond when applied to a camera that doesn't support the requested analytics type (e.g., NPR on a camera without NPR hardware)?
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST authenticate all requests to protected API endpoints using JWT tokens with configurable expiration (default 1 hour for access tokens, 7 days for refresh tokens)
- **FR-002**: System MUST implement role-based access control with at least three roles: viewer (read-only camera access), operator (camera control + viewing), administrator (full system configuration)
- **FR-003**: System MUST provide granular permissions allowing access restriction per camera channel
- **FR-004**: System MUST expose live video streams for all configured GeViScope channels with initialization time under 2 seconds
- **FR-005**: System MUST support PTZ control operations (pan, tilt, zoom, preset positions) with command response time under 500ms
- **FR-006**: System MUST provide WebSocket endpoint for real-time event notifications with delivery latency under 100ms
- **FR-007**: System MUST support event subscriptions by type (alarms, analytics, system events) and by camera channel
- **FR-008**: System MUST translate all GeViScope SDK actions to RESTful API endpoints following the pattern `/api/v1/{resource}/{id}/{action}`
- **FR-009**: System MUST handle concurrent video stream requests from minimum 100 simultaneous users without degradation
- **FR-010**: System MUST support WebSocket connections from minimum 1000 concurrent clients for event notifications
- **FR-011**: System MUST provide recording management including start/stop recording, schedule configuration, and recording status queries
- **FR-012**: System MUST expose recording capacity metrics including total capacity, free space, recording depth in hours, and oldest recording timestamp
- **FR-013**: System MUST support video analytics configuration for VMD (Video Motion Detection), OBTRACK (object tracking and people counting), NPR (license plate recognition), and G-Tect (perimeter protection) where hardware supports these features
- **FR-014**: System MUST provide query capabilities for recorded footage by channel, time range, and event association
- **FR-015**: System MUST export video segments in standard formats (MP4 or AVI) with embedded timestamps and metadata
- **FR-016**: System MUST log all authentication attempts (successful and failed) with username, source IP, and timestamp
- **FR-017**: System MUST audit log all privileged operations including PTZ control, recording management, configuration changes, and user management with operator ID, action, target, and timestamp
- **FR-018**: System MUST gracefully handle camera offline scenarios by returning appropriate error codes and status information
- **FR-019**: System MUST implement retry logic for transient SDK communication failures (3 attempts with exponential backoff)
- **FR-020**: System MUST provide health check endpoint returning API status, SDK connectivity, database availability, and system resource usage
- **FR-021**: System MUST serve auto-generated OpenAPI/Swagger documentation at `/docs` endpoint
- **FR-022**: System MUST return meaningful error messages with error codes for all failure scenarios without exposing internal stack traces
- **FR-023**: System MUST support API versioning in URL path (v1, v2) to allow backward-compatible evolution
- **FR-024**: System MUST rate limit authentication attempts to prevent brute force attacks (max 5 attempts per IP per minute)
- **FR-025**: System MUST enforce TLS 1.2+ for all API communication in production environments
- **FR-026**: System MUST translate Windows error codes from GeViScope SDK to appropriate HTTP status codes with user-friendly messages
- **FR-027**: System MUST support filtering and pagination for endpoints returning lists (camera lists, recording lists, event histories)
- **FR-028**: System MUST handle GeViScope SDK ring buffer architecture by exposing recording depth and capacity warnings when storage approaches limits
- **FR-029**: System MUST support event correlation using ForeignKey parameter to link events with external system identifiers
- **FR-030**: System MUST allow configuration of pre-alarm and post-alarm recording duration for event-triggered recordings
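
FR-019 asks for three attempts with exponential backoff on transient SDK failures. One way to sketch that is shown below. `TransientSdkError` and `call_with_retry` are hypothetical names, the base delay is an assumed default, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientSdkError(Exception):
    """Hypothetical marker for SDK failures that are worth retrying."""


def call_with_retry(operation: Callable[[], T], attempts: int = 3,
                    base_delay: float = 0.1,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient failures with doubling delays."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientSdkError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.1 s, 0.2 s, 0.4 s, ...
    raise ValueError("attempts must be at least 1")
```

With the defaults, a call that keeps failing waits 0.1 s and then 0.2 s before the third and final attempt raises.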
### Key Entities
- **Camera**: Represents a video input channel with properties including channel ID, name, location, capabilities (PTZ support, analytics support), current status (online/offline/recording), stream URL, and permissions
- **User**: Authentication entity with username, hashed password, assigned role, permissions list, JWT tokens, and audit trail of actions
- **Event**: Surveillance occurrence with type ID (motion, alarm, analytics), event ID (instance), channel, timestamp, severity, associated data (e.g., NPR plate number, object tracking ID), and foreign key for external correlation
- **Recording**: Video footage segment with channel, start time, end time, file size, recording trigger (scheduled, event, manual), and retention policy
- **Stream**: Active video stream session with channel, user, start time, format, quality level, and connection status
- **Analytics Configuration**: Video content analysis settings with type (VMD, NPR, OBTRACK, G-Tect, CPA), channel, enabled zones/regions, sensitivity parameters, and alert thresholds
- **PTZ Preset**: Saved camera position with preset ID, channel, name, pan/tilt/zoom values
- **Audit Log Entry**: Security and operations record with timestamp, user, action type, target resource, outcome (success/failure), and detailed parameters
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Developers can authenticate and make their first successful API call within 10 minutes of reading the quick start documentation
- **SC-002**: Security operators can view live video from any authorized camera with video appearing on screen within 2 seconds of request
- **SC-003**: PTZ camera movements respond to operator commands within 500ms, providing responsive control for incident investigation
- **SC-004**: Real-time event notifications are delivered to subscribed clients within 100ms of event occurrence, enabling rapid incident response
- **SC-005**: System supports 100 concurrent video streams without any individual stream experiencing frame drops or quality degradation
- **SC-006**: System handles 1000+ concurrent WebSocket connections for event notifications with message delivery rates exceeding 99.9%
- **SC-007**: API metadata queries (camera lists, status checks, user info) return results in under 200ms for 95% of requests
- **SC-008**: System maintains 99.9% uptime during production operation, measured as availability of the health check endpoint
- **SC-009**: Operators can successfully complete all primary surveillance tasks (view cameras, control PTZ, receive alerts, query recordings) without requiring technical support
- **SC-010**: API documentation is sufficiently complete that 90% of integration questions can be answered by reading the OpenAPI specification and examples
- **SC-011**: Failed authentication attempts are logged and administrators receive alerts for potential security threats within 5 minutes of detection
- **SC-012**: Video export requests for segments up to 1 hour complete within 5 minutes and produce files playable in standard media players
- **SC-013**: System gracefully handles camera failures, with offline cameras clearly indicated and the API remaining operational for all other cameras
- **SC-014**: Recording capacity warnings are provided when storage reaches 80% capacity, allowing administrators to take action before recordings are lost
- **SC-015**: During peak load (500 requests/second), the system maintains response time targets with no more than 0.1% of requests timing out or failing
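
A criterion such as SC-007 ("under 200ms for 95% of requests") is checked against a latency percentile. The sketch below uses the nearest-rank method, which is one of several percentile definitions; the function names are illustrative.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value covering pct% of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_target(latencies_ms: Sequence[float],
                         target_ms: float = 200, pct: int = 95) -> bool:
    """True when the pct-th percentile latency is under the target."""
    return percentile(latencies_ms, pct) < target_ms
```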
### Business Impact
- **BI-001**: Custom surveillance applications can be developed and deployed in under 1 week using the API, compared to 4-6 weeks with direct SDK integration
- **BI-002**: Support requests drop by 60% compared to direct SDK usage, because the API abstracts SDK complexity and provides clear error messages
- **BI-003**: Enable integration with third-party systems (access control, building management, alarm systems) that previously couldn't interface with GeViScope
- **BI-004**: Support mobile and web-based surveillance clients that cannot run the Windows SDK, expanding platform compatibility
## Dependencies *(mandatory)*
### External Dependencies
- **GeViScope SDK 7.9.975.68+**: Core surveillance system SDK providing video streams, camera control, and event management
- **Windows Server 2016+** or **Windows 10/11**: Required platform for GeViScope SDK operation
- **Active Geutebruck Surveillance System**: Physical cameras, recording servers, and network infrastructure must be configured and operational
### Assumptions
- GeViScope SDK is already installed and configured with cameras connected and functional
- Network connectivity exists between API server and GeViScope SDK service
- Sufficient storage capacity available for ring buffer recording as configured in GeViScope
- Client applications can consume RESTful APIs and WebSocket connections
- Authentication credentials for GeViScope SDK are available for API integration
- Standard industry retention and performance expectations apply unless otherwise specified by regulations
- JWT-based authentication is acceptable for client applications (OAuth2 flow not required initially)
- Video streaming will use existing GeViScope streaming protocols (direct URL or stream proxy to be determined during technical planning)
- Redis or similar in-memory database available for session management and caching
- SSL/TLS certificates can be obtained and configured for production deployment
### Out of Scope
- Direct camera hardware management (firmware updates, network configuration) - handled by GeViScope
- Video storage architecture changes - API uses existing GeViScope ring buffer
- Custom video codec development - API uses GeViScope's supported formats
- Mobile native SDKs - this specification covers REST API only, client SDKs are separate future work
- Video wall display management - API provides data, UI implementation is client responsibility
- Bi-directional audio communication - audio monitoring may be included but two-way audio is deferred
- Access control system integration - API provides data interfaces but integration logic is external
- Custom analytics algorithm development - API configures existing GeViScope analytics, custom algorithms are separate work
## Constraints
### Technical Constraints
- API must run on Windows platform due to GeViScope SDK dependency
- All video operations must use GeViScope's channel-based architecture (Channel ID parameter required)
- Event notifications limited to events supported by GeViScope SDK action system
- Recording capabilities bounded by GeViScope SDK's ring buffer architecture
- Analytics features only available for cameras with hardware support (cannot enable NPR on camera without NPR hardware)
### Performance Constraints
- Maximum concurrent streams limited by GeViScope SDK license and hardware capacity
- WebSocket connection limits determined by operating system socket limits and available memory
- API response times dependent on GeViScope SDK response characteristics
- Video stream initialization time includes SDK processing delay (targeted under 2 seconds total)
### Security Constraints
- All API communication must use TLS 1.2+ in production
- JWT tokens must have configurable expiration to balance security and usability
- Audit logging must be tamper-evident (append-only, with checksums or write to immutable storage)
- Credentials for GeViScope SDK must be stored securely (environment variables, key vault)
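
The "append-only, with checksums" option for tamper-evident audit logging is commonly realised as a hash chain, where each record's checksum covers the previous record's checksum. The following is a sketch of that technique under assumed record shapes, not a chosen design; persistence and immutable storage are out of its scope.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed placeholder checksum before the first entry


def _checksum(prev: str, entry: dict) -> str:
    """SHA-256 over the previous checksum plus the canonical entry JSON."""
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[dict], entry: dict) -> None:
    """Append an audit entry chained to the previous record."""
    prev = log[-1]["checksum"] if log else GENESIS
    log.append({"entry": entry, "prev": prev,
                "checksum": _checksum(prev, entry)})


def verify_log(log: List[dict]) -> bool:
    """Recompute the chain; any edited or removed record breaks it."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev:
            return False
        if record["checksum"] != _checksum(prev, record["entry"]):
            return False
        prev = record["checksum"]
    return True
```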
## Risk Analysis
### High Impact Risks
1. **GeViScope SDK Stability**: If SDK crashes or becomes unresponsive, API loses all functionality
   - *Mitigation*: Implement circuit breaker pattern, health monitoring, automatic SDK reconnection logic
2. **Performance Under Load**: Concurrent stream limits may be lower than target (100 streams)
   - *Mitigation*: Load testing early in development, potentially implement stream quality adaptation
3. **Windows Platform Dependency**: Restricts deployment options and increases operational complexity
   - *Mitigation*: Document Windows container approach, design SDK bridge for potential future Linux support via proxy
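
The circuit breaker named in the mitigation for risk 1 stops calling a failing SDK for a cool-down period and then lets a single trial call through. The sketch below shows the general pattern with assumed thresholds; `CircuitBreaker` and `CircuitOpenError` are hypothetical names, and the injectable `clock` exists only to make the timing testable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when calls are rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3,
                 reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("SDK circuit is open")
            # Cool-down elapsed: half-open, allow one trial call.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if (self.failures >= self.failure_threshold
                    or self.opened_at is not None):
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```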
### Medium Impact Risks
4. **SDK Version Compatibility**: Future GeViScope SDK updates may break API integration
   - *Mitigation*: Version testing before SDK upgrades, maintain SDK abstraction layer
5. **WebSocket Scalability**: 1000+ concurrent connections may stress resources
   - *Mitigation*: Connection pooling, message batching, load testing, potential horizontal scaling
6. **Network Latency**: Event notifications and video streams sensitive to network conditions
   - *Mitigation*: Document network requirements, implement connection quality monitoring
### Low Impact Risks
7. **Documentation Drift**: API changes may outpace documentation updates
   - *Mitigation*: Auto-generated OpenAPI specs from code, documentation review in PR process
## Notes
This specification focuses on **WHAT** the API enables users to do and **WHY** it's valuable, avoiding **HOW** it will be implemented. Technical decisions about Python/FastAPI, specific database choices, video streaming protocols, and SDK integration mechanisms will be made during the `/speckit.plan` phase.
The user stories are prioritized for iterative development:
- **P1 stories** (1-4) form the MVP: authentication, live viewing, PTZ control, event notifications
- **P2 stories** (5-7) add operational capabilities: recording management, analytics configuration, multi-camera coordination
- **P3 stories** (8-10) provide enhancements: specialized analytics (NPR), evidence export, system monitoring
Each story is independently testable and delivers standalone value, enabling flexible development sequencing and incremental delivery to users.