Chapter Build — TDHCA Multifamily Management System

# Chapter 6: Non-Functional Requirements

> **Chapter purpose**: This chapter provides the design intent and implementation guidance for the system's non-functional requirements. Before implementing any of them, first understand the inputs and outputs involved, then identify the dependencies and prerequisites.

This chapter defines the non-functional requirements (NFRs) for the cloud-based web application that streamlines the manual workflows of Texas Department of Housing and Community Affairs (TDHCA) underwriters. The requirements cover performance, scalability, availability and reliability, monitoring and alerting, disaster recovery, and accessibility. Meeting them ensures that the application satisfies its functional expectations while also holding to high standards of quality, security, and user experience.

## Performance Requirements

Performance directly affects user satisfaction and operational efficiency. The application must remain responsive up to its defined peak load of 200 concurrent users. The following performance requirements have been identified:

1. **Response Time**: The application must respond to user actions, such as submitting applications, generating reports, and loading dashboards, within 2 seconds under normal load. Under peak load, defined as 200 concurrent users, response time must not exceed 4 seconds.

2. **Throughput**: The system should be capable of processing at least 100 applications per hour. This metric will be monitored to ensure that the application can handle the expected workload without degradation in performance.

3. **Resource Utilization**: CPU and memory utilization should remain below 70% under peak load, leaving headroom for traffic spikes and for scaling to take effect before performance degrades. Utilization will be tracked with cloud-based monitoring tools such as AWS CloudWatch or Azure Monitor.

4. **Load Testing**: The application must undergo load testing using tools such as Apache JMeter or Gatling to simulate user interactions and measure performance metrics. Load tests should be conducted prior to deployment and after major updates.

5. **Caching Strategy**: To improve performance, a caching layer will be implemented using Redis. Frequently accessed data, such as user profiles and application statuses, will be cached to reduce database load and improve response times. The caching strategy will be defined in the configuration files as follows:
   ```yaml
   cache:
     enabled: true
     type: redis
     ttl: 300  # Time to live in seconds
   ```

6. **API Performance**: All API endpoints must return responses within 200 milliseconds under normal load conditions. This will be validated through automated performance tests integrated into the CI/CD pipeline.
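
The caching requirement above describes a cache-aside pattern with a fixed time to live. The following is a minimal sketch of that pattern, assuming an in-memory dictionary as a stand-in for Redis so that the example runs on its own. The `TtlCache` class, its `get_or_load` method, and the injectable `clock` are hypothetical names chosen for illustration; only the 300-second TTL comes from the configuration above.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class TtlCache(Generic[V]):
    """In-memory stand-in for the Redis cache (illustrative only)."""

    def __init__(
        self,
        ttl_seconds: float = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get_or_load(self, key: str, loader: Callable[[], V]) -> V:
        """Return the cached value, or call loader() and cache its result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: entry has not expired yet
        value = loader()  # cache miss: fall back to the slower data source
        self._entries[key] = (now + self._ttl, value)
        return value
```

In a real deployment the dictionary would be replaced by calls to the Redis client, and `loader` would be the database query for the user profile or application status being cached.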

## Scalability Approach

Scalability is essential for accommodating growth in user demand and application usage. The application architecture will be designed to support both vertical and horizontal scaling. The following strategies will be employed:

1. **Microservices Architecture**: The application will be decomposed into microservices, each responsible for a specific business capability. This allows individual services to be scaled independently based on demand. For example, the user management service can be scaled separately from the reporting service.

2. **Containerization**: All services will be containerized using Docker, enabling easy deployment and scaling across different environments. The Dockerfile for each service will specify the necessary dependencies and configurations. An example Dockerfile for the user service is as follows:
   ```dockerfile
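   # Use a currently supported Node.js LTS image
   FROM node:22
   WORKDIR /app
   # Copy the manifests first so the dependency layer is cached between builds
   COPY package*.json ./
   RUN npm install
   COPY . .
   CMD ["npm", "start"]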
   ```

3. **Kubernetes Orchestration**: The application will be deployed on a Kubernetes cluster, allowing for automated scaling and management of containerized applications. Horizontal Pod Autoscalers (HPA) will be configured to automatically adjust the number of pods based on CPU utilization or custom metrics.

4. **Database Sharding**: The database will be designed to support sharding, allowing data to be distributed across multiple database instances. This will improve read and write performance as the user base grows. The sharding strategy will be defined in the database configuration as follows:
   ```json
   {
     "sharding": {
       "enabled": true,
       "shardCount": 4
     }
   }
   ```

5. **Load Balancing**: A load balancer will be implemented to distribute incoming traffic across multiple instances of the application. This will ensure that no single instance becomes a bottleneck. The load balancer configuration will include health checks to route traffic only to healthy instances.

6. **Asynchronous Processing**: Long-running tasks, such as generating reports or sending notifications, will be processed asynchronously using a message queue (e.g., RabbitMQ). This will prevent blocking of user interactions and improve overall application responsiveness. The message queue configuration will be defined in the application settings:
   ```yaml
   messageQueue:
     type: rabbitmq
     host: rabbitmq.example.com
     queueName: reportGeneration
   ```
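
To make the sharding strategy above concrete, the sketch below shows one common way to route a record to a shard: hash a stable key and take the result modulo the shard count. This is an assumption about how routing could work, not the project's chosen scheme. The `shard_for` function and the key format are hypothetical; the default of 4 shards mirrors `shardCount` in the configuration above.

```python
import hashlib


def shard_for(key: str, shard_count: int = 4) -> int:
    """Map a record key to a shard index in the range [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Use a cryptographic digest rather than the built-in hash(), which is
    # randomized per process and would route the same key inconsistently.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

One design consequence is worth noting: with plain modulo routing, changing the shard count remaps most keys, so growing beyond the initial shard count requires a data migration plan or a scheme such as consistent hashing.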

## Availability & Reliability

Ensuring high availability and reliability is crucial for the application, especially given its role in processing sensitive housing data. The following strategies will be implemented:

1. **Uptime Guarantee**: The application must achieve an uptime of 99.9% over a rolling 30-day period. This will be monitored using uptime monitoring tools such as Pingdom or UptimeRobot.

2. **Redundancy**: All critical components of the application, including the database and application servers, will be deployed in a redundant configuration across multiple availability zones (AZs). This will ensure that the application remains operational even in the event of an AZ failure.

3. **Health Checks**: Automated health checks will be implemented for all services to monitor their status and performance. If a service fails a health check, it will be automatically restarted or replaced by Kubernetes.

4. **Graceful Degradation**: In the event of a service failure, the application will be designed to degrade gracefully. For example, if the reporting service is unavailable, users will still be able to submit applications and access other features without interruption.

5. **Backup Strategy**: The database and application data will be backed up regularly to protect against data loss. Backups will be stored in a separate location, and restoration will be tested regularly. The deployment scripts will run a dump such as the following on the backup schedule:
   ```bash
   # Backup script: write each dump to its own timestamped file
   pg_dump -U db_user -h db_host db_name > "backup_$(date +%Y%m%d_%H%M%S).sql"
   ```

6. **Incident Response Plan**: An incident response plan will be developed to outline procedures for responding to outages or security incidents. This plan will include communication protocols, escalation paths, and post-incident reviews.
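
The uptime target above translates into a concrete downtime budget, which is often easier to reason about during incident reviews. The sketch below works through that arithmetic. The function name `downtime_budget_minutes` is hypothetical, and `Decimal` is used only to avoid floating-point rounding in the result.

```python
from decimal import Decimal


def downtime_budget_minutes(uptime_percent: str, window_days: int) -> Decimal:
    """Return the minutes of downtime permitted by an uptime target."""
    window_minutes = Decimal(window_days) * 24 * 60
    unavailable_fraction = Decimal(1) - Decimal(uptime_percent) / Decimal(100)
    return window_minutes * unavailable_fraction
```

For the 99.9% target over a rolling 30-day window, the window is 43,200 minutes and the budget is 0.1% of that, or about 43.2 minutes of total downtime.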

## Monitoring & Alerting

Effective monitoring and alerting are essential for maintaining application performance and reliability. The following monitoring strategies will be implemented:

1. **Application Performance Monitoring (APM)**: APM tools such as New Relic or Datadog will be integrated to monitor application performance metrics, including response times, error rates, and throughput. This will provide insights into application behavior and help identify performance bottlenecks.

2. **Log Management**: Centralized logging will be implemented using tools such as the ELK Stack (Elasticsearch, Logstash, Kibana) or Splunk. All application logs will be collected and analyzed to detect anomalies and troubleshoot issues. Log retention policies will be defined to manage storage costs:
   ```json
   {
     "logRetentionDays": 30
   }
   ```

3. **Alerting Mechanisms**: Alerts will be configured to notify the DevOps team of critical issues, such as high error rates or service downtime. Alerts will be sent via email, Slack, or SMS, depending on the severity of the issue. An example alert configuration is as follows:
   ```yaml
   alerts:
     - name: HighErrorRate
       condition: "errorRate > 5%"
       action: "notify-team"
   ```

4. **User Analytics**: User behavior and event tracking will be implemented using tools such as Mixpanel or Amplitude. This will provide insights into user interactions and help identify areas for improvement in the user experience.

5. **Infrastructure Monitoring**: Infrastructure components, including servers and databases, will be monitored using cloud provider tools (e.g., AWS CloudWatch) to track resource utilization and performance metrics. Alerts will be configured for resource thresholds to prevent outages.

6. **Regular Review**: Monitoring dashboards and alert configurations will be reviewed regularly to ensure they remain relevant and effective. This will include updating thresholds based on historical data and user feedback.
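
The `HighErrorRate` rule above can be read as a comparison over a window of requests. The sketch below shows one way such a rule might be evaluated, assuming the monitoring tool supplies request counts per window. `WindowStats`, `error_rate_percent`, and `high_error_rate` are hypothetical names; only the 5% threshold comes from the alert configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStats:
    """Request counts observed in one evaluation window (illustrative)."""

    total_requests: int
    failed_requests: int


def error_rate_percent(stats: WindowStats) -> float:
    """Return failed requests as a percentage of all requests."""
    if stats.total_requests == 0:
        return 0.0  # no traffic means no errors to alert on
    return 100.0 * stats.failed_requests / stats.total_requests


def high_error_rate(stats: WindowStats, threshold_percent: float = 5.0) -> bool:
    """Return True when the error rate strictly exceeds the threshold."""
    return error_rate_percent(stats) > threshold_percent
```

Because the comparison is strict, a window at exactly 5% does not fire the alert. The window length is a separate choice that the configuration above does not specify, and very short windows with little traffic can produce noisy alerts.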

## Disaster Recovery

A comprehensive disaster recovery (DR) plan is essential to ensure business continuity in the event of a catastrophic failure. The following strategies will be implemented:

1. **Disaster Recovery Plan**: A formal disaster recovery plan will be documented, outlining the steps to be taken in the event of a disaster. The plan will define roles and responsibilities, communication protocols, the recovery time objective (RTO), and the recovery point objective (RPO).

2. **Geographic Redundancy**: The application will be deployed across multiple geographic regions to ensure that a failure in one region does not impact the availability of the application. This will involve replicating data and services across regions.

3. **Data Backup and Restoration**: Regular backups of application data will be performed, and restoration procedures will be tested to ensure data can be recovered quickly in the event of data loss. Backup frequency will be defined based on the criticality of the data:
   ```yaml
   backup:
     frequency: daily
     retention: 14 days
   ```

4. **Failover Mechanisms**: Automated failover mechanisms will be implemented to switch to backup systems in the event of a primary system failure. This will involve configuring DNS failover and load balancer settings to redirect traffic to backup instances.

5. **Testing and Drills**: Regular disaster recovery drills will be conducted to test the effectiveness of the DR plan. This will involve simulating various disaster scenarios and evaluating the response and recovery times.

6. **Documentation and Training**: All team members will be trained on the disaster recovery plan, and documentation will be kept up to date to reflect any changes in the application architecture or processes.
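
The backup configuration above pairs a daily frequency with a 14-day retention period. The sketch below illustrates how a retention policy of that kind could select backups for deletion, assuming each backup is identified by the date it was taken. The `expired_backups` function is a hypothetical helper, not part of any backup tool.

```python
from datetime import date, timedelta
from typing import Iterable, List


def expired_backups(
    backup_dates: Iterable[date],
    today: date,
    retention_days: int = 14,
) -> List[date]:
    """Return backup dates older than the retention window, oldest first."""
    cutoff = today - timedelta(days=retention_days)
    # A backup taken exactly retention_days ago is still kept.
    return sorted(d for d in backup_dates if d < cutoff)
```

With daily backups, the worst-case data loss from backups alone approaches 24 hours, so the schedule is worth checking against the RPO once that objective is set in the disaster recovery plan.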

## Accessibility Standards

Ensuring that the application is accessible to all users, including those with disabilities, is a fundamental requirement. The following accessibility standards will be adhered to:

1. **WCAG Compliance**: The application will be designed to comply with the Web Content Accessibility Guidelines (WCAG) 2.1 at the AA level. This includes ensuring that all content is perceivable, operable, understandable, and robust for users with disabilities.

2. **Keyboard Navigation**: All interactive elements of the application must be navigable using a keyboard. This includes forms, buttons, and links. Focus indicators will be implemented to provide visual feedback on keyboard navigation.

3. **Screen Reader Compatibility**: The application will be tested for compatibility with screen readers, ensuring that all content is properly announced and that users can navigate the application effectively.

4. **Color Contrast**: All text will meet the WCAG 2.1 AA minimum contrast ratios of 4.5:1 for normal text and 3:1 for large text, and interactive elements will meet the 3:1 minimum for user interface components, to ensure readability for users with visual impairments. Tools such as the WebAIM Contrast Checker will be used to validate color choices.

5. **Alternative Text**: All images and non-text content will include alternative text descriptions to provide context for users who rely on assistive technologies. This will be implemented in the HTML as follows:
   ```html
   <img src="image.jpg" alt="Description of the image">
   ```

6. **User Testing**: Accessibility testing will be conducted with real users who have disabilities to identify any barriers to usability. Feedback will be incorporated into the design and development process to continuously improve accessibility.
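
Color contrast is one of the few accessibility checks that can be automated exactly, because WCAG defines the contrast ratio as a formula over the relative luminance of two colors. The sketch below implements that formula for six-digit hex colors, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are hypothetical, and the sketch complements rather than replaces tools such as the WebAIM Contrast Checker.

```python
from typing import Tuple


def _parse_hex(color: str) -> Tuple[int, int, int]:
    """Parse a '#rrggbb' string into 0-255 channel values."""
    value = color.lstrip("#")
    if len(value) != 6:
        raise ValueError("expected a six-digit hex color such as #1a2b3c")
    return int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16)


def _linearize(channel: int) -> float:
    """Convert an sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(color: str) -> float:
    r, g, b = (_linearize(ch) for ch in _parse_hex(color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Return the WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """Check a color pair against the WCAG AA minimum contrast ratio."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

A check like this could run in the CI/CD pipeline against the design system's color tokens, so that a failing pair is caught before it reaches manual accessibility testing.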

## Conclusion

This chapter has outlined the non-functional requirements for the TDHCA underwriters' web application: performance, scalability, availability and reliability, monitoring and alerting, disaster recovery, and accessibility. Together they aim to produce a robust, user-friendly application that meets the needs of its users and complies with relevant regulations. These NFRs will be monitored and refined throughout the development lifecycle as user needs and technology change.