System testing¶
Install the plugins¶
Test Case ID | install_lma_plugins |
Description | Verify that the plugins can be installed. |
Prerequisites | N/A |
Steps¶
- Copy the 4 plugins to the Fuel master node using scp.
- Connect to the Fuel master node using ssh.
- Install the plugins using the fuel CLI.
- Connect to the Fuel web UI.
- Create a new environment using the Fuel UI Wizard.
- Click on the Plugins tab.
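The CLI part of these steps can be sketched as follows; this is a minimal sketch assuming the plugins are delivered as RPM packages, and the file names and the Fuel master address are placeholders:

```shell
# Sketch of steps 1-3; the .rpm names and 10.20.0.2 are placeholders.
install_lma_plugins() {
    fuel_master=${1:-10.20.0.2}
    # Copy the 4 plugin packages to the Fuel master node.
    scp lma_collector-*.rpm elasticsearch_kibana-*.rpm \
        influxdb_grafana-*.rpm lma_infrastructure_alerting-*.rpm \
        "root@${fuel_master}:/tmp/"
    # Install each package with the fuel CLI.
    ssh "root@${fuel_master}" \
        'for p in /tmp/*.rpm; do fuel plugins --install "$p"; done'
}
```

For example: install_lma_plugins 10.20.0.2 (the address is an example).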
Expected Result¶
The 4 plugins are present in the Fuel UI.
Deploy an environment with the plugins¶
Test Case ID | deploy_lma_plugins |
Description | Verify that the plugins can be deployed. |
Prerequisites | Plugins are installed on the Fuel master node (see Install the plugins). |
Steps¶
Connect to the Fuel web UI.
Create a new environment with the Fuel UI wizard with the default settings.
Click on the Settings tab of the Fuel web UI.
Select the LMA collector plugin tab and fill in the following fields:
- Enable the plugin.
- Select ‘Local node’ for “Event analytics”.
- Select ‘Local node’ for “Metric analytics”.
- Select ‘Alerts sent to a local node running the LMA Infrastructure Alerting plugin’ for “Alerting”.
Select the Elasticsearch-Kibana plugin tab and enable it.
Select the InfluxDB-Grafana plugin and fill in the required fields:
- Enable the plugin.
- Enter ‘lmapass’ as the root password, user password, and Grafana user password.
Select the LMA Infrastructure-Alerting plugin and fill in the required fields:
- Enable the plugin.
- Enter ‘root@localhost’ as the recipient.
- Enter ‘nagios@localhost’ as the sender.
- Enter ‘127.0.0.1’ as the SMTP server address.
- Choose “None” for SMTP authentication (default).
Click on the Nodes tab of the Fuel web UI.
Assign roles to nodes:
1 node with these 3 roles (this node is referenced later as the ‘lma’ node):
- influxdb_grafana
- elasticsearch_kibana
- infrastructure_alerting
3 nodes with the ‘controller’ role
1 node with the ‘compute’ + ‘cinder’ roles
Click ‘Deploy changes’.
Once the deployment has finished, connect to each node of the environment using ssh and run the following checks:
- Check that hekad and collectd processes are up and running on all the nodes as described in the LMA Collector documentation.
- Look for errors in /var/log/lma_collector.log.
- Check that the node can connect to the Elasticsearch server (http://<IP address of the 'lma' node>:9200/).
- Check that the node can connect to the InfluxDB server (http://<IP address of the 'lma' node>:8086/).
Check that the dashboards are running:
- Check that you can connect to the Kibana UI (http://<IP address of the 'lma' node>:80/).
- Check that you can connect to the Grafana UI (http://<IP address of the 'lma' node>:8000/) with user=’lma’, password=’lmapass’.
- Check that you can connect to the Nagios UI (http://<IP address of the 'lma' node>:8001/) with user=’nagiosadmin’, password=’r00tme’.
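The connectivity checks above can be scripted; a minimal sketch, where the ports come from the test case URLs and LMA_IP stands for the 'lma' node address:

```shell
# Print the HTTP status returned by each backend listed in the steps
# above. Any HTTP response proves the node can reach the service; the
# Grafana (8000) and Nagios (8001) UIs additionally require the test
# case credentials.
check_lma_endpoints() {
    LMA_IP=$1
    for url in \
        "http://${LMA_IP}:9200/" \
        "http://${LMA_IP}:8086/" \
        "http://${LMA_IP}:80/" \
        "http://${LMA_IP}:8000/" \
        "http://${LMA_IP}:8001/"
    do
        curl -s -o /dev/null -w "%{http_code} ${url}\n" "$url"
    done
}
```

Run it from each deployed node, e.g. check_lma_endpoints 10.109.1.4 (the address is an example).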
Expected Result¶
The environment is deployed successfully.
Add/remove controller nodes in existing environment¶
Test Case ID | modify_env_with_plugin_remove_add_controller |
Description | Verify that the number of controllers can scale up and down. |
Prerequisites | Environment deployed with the 4 plugins (See Deploy an environment with the plugins). |
Steps¶
- Remove 1 node with the controller role.
- Re-deploy the cluster.
- Check the plugin services using the CLI.
- Check in the Nagios UI that the removed node is no longer monitored.
- Run the health checks (OSTF).
- Add 1 new node with the controller role.
- Re-deploy the cluster.
- Check the plugin services using the CLI.
- Check in the Nagios UI that the new node is monitored.
- Run the health checks (OSTF).
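The "check the plugin services using the CLI" steps can be sketched as below, assuming root ssh access to the nodes and that the collector runs as the hekad and collectd processes (as stated in the deployment test case):

```shell
# Verify on each given node that the collector processes are alive.
# The node addresses are placeholders supplied by the caller.
check_collector_processes() {
    for node in "$@"; do
        ssh "root@${node}" 'pidof hekad >/dev/null && pidof collectd >/dev/null' \
            && echo "${node}: collector OK" \
            || echo "${node}: collector NOT running"
    done
}
```

For example: check_collector_processes 10.109.1.5 10.109.1.6 (addresses are examples).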
Expected Result¶
The OSTF tests pass successfully.
All the plugin services are running and work as expected after each modification of the environment.
The Nagios service has been reconfigured to account for the node removal and addition.
Add/remove compute nodes in existing environment¶
Test Case ID | modify_env_with_plugin_remove_add_compute |
Description | Verify that the number of computes can scale up and down. |
Prerequisites | Environment deployed with the 4 plugins (See Deploy an environment with the plugins). |
Steps¶
- Remove 1 node with the compute role.
- Re-deploy the cluster.
- Check the plugin services using the CLI.
- Check in the Nagios UI that the removed node is no longer monitored.
- Run the health checks (OSTF).
- Add 1 new node with the compute role.
- Re-deploy the cluster.
- Check the plugin services using the CLI.
- Check in the Nagios UI that the new node is monitored.
- Run the health checks (OSTF).
Expected Result¶
The OSTF tests pass successfully.
All the plugin services are running and work as expected after each modification of the environment.
The Nagios service has been reconfigured to account for the node removal and addition.
Uninstall the plugins with deployed environment¶
Test Case ID | uninstall_plugin_with_deployed_env |
Description | Verify that the plugins can be uninstalled after the deployed environment is removed. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Try to remove the plugins using the Fuel CLI and ensure that the command fails with “Can’t delete plugin which is enabled for some environment”.
- Remove the environment.
- Remove the plugins.
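The removal attempt can be sketched with the fuel CLI; the plugin name and version are placeholders for the values reported by `fuel plugins --list`:

```shell
# Attempt to remove one plugin. While an environment still uses the
# plugin, this is expected to fail with:
# "Can't delete plugin which is enabled for some environment"
remove_lma_plugin() {
    name=$1 version=$2
    fuel plugins --remove "${name}==${version}"
}
```

For example: remove_lma_plugin lma_collector 0.8.0 (the version is an example).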
Expected Result¶
An alert is raised when trying to delete plugins that are attached to an active environment.
After the environment is removed, the plugins are removed successfully too.
Uninstall the plugins¶
Test Case ID | uninstall_plugin |
Description | Verify that the plugins can be uninstalled. |
Prerequisites | The 4 plugins are installed on the Fuel node (see Install the plugins). |
Steps¶
- Remove the plugins.
Expected Result¶
The plugins are removed.
Functional testing¶
Display and query logs in the Kibana UI¶
Test Case ID | query_logs_in_kibana_ui |
Description | Verify that the logs show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Enter ‘programname:nova*’ in the Query box.
- Check that Nova logs are displayed.
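The same query can also be issued against the Elasticsearch API directly, which is handy for automating this check; the log-* index pattern is an assumption about how the collector names its indices:

```shell
# Query Nova log entries straight from Elasticsearch; LMA_IP is the
# 'lma' node address and log-* is an assumed index pattern.
query_nova_logs() {
    LMA_IP=$1
    curl -s "http://${LMA_IP}:9200/log-*/_search?q=programname:nova*&size=5"
}
```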
Expected Result¶
The Kibana UI displays entries for all the controller and compute nodes deployed in the environment.
Display and query Nova notifications in the Kibana UI¶
Test Case ID | query_nova_notifications_in_kibana_ui |
Description | Verify that the Nova notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Launch, update, rebuild, resize, power-off, power-on, snapshot, suspend, shutdown, and delete an instance in the OpenStack environment (using the Horizon dashboard for example) and write down the instance’s id.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘instance_id:<uuid>’ in the Query box where <uuid> is the id of the launched instance.
Expected Result¶
All event types for Nova are listed except compute.instance.create.error and compute.instance.resize.revert.{start|end}.
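This test case and the following notification test cases share the same query shape; a sketch against the Elasticsearch API, where the notification-* index pattern is an assumption about the collector's index naming:

```shell
# Query notification documents from Elasticsearch; LMA_IP is the 'lma'
# node address, and the query string matches the Kibana Query box,
# e.g. 'instance_id:<uuid>', 'glance' or 'keystone'.
query_notifications() {
    LMA_IP=$1 query=$2
    curl -s "http://${LMA_IP}:9200/notification-*/_search?q=${query}&size=5"
}
```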
Display and query Glance notifications in the Kibana UI¶
Test Case ID | query_glance_notifications_in_kibana_ui |
Description | Verify that the Glance notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Run the OSTF platform test “Check create, update and delete image actions using Glance v2”.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘glance’ in the Query box.
Expected Result¶
All event types for Glance are listed.
Display and query Cinder notifications in the Kibana UI¶
Test Case ID | query_cinder_notifications_in_kibana_ui |
Description | Verify that the cinder notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Create and update a volume in the OpenStack environment (using the Horizon dashboard for example) and write down the volume id.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘volume_id:<uuid>’ in the Query box where <uuid> is the id of the created volume.
Expected Result¶
All event types for Cinder are listed.
Display and query Heat notifications in the Kibana UI¶
Test Case ID | query_heat_notifications_in_kibana_ui |
Description | Verify that the heat notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Run all OSTF Heat platform tests.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘heat’ in the Query box.
Expected Result¶
All event types for Heat are listed.
Display and query Neutron notifications in the Kibana UI¶
Test Case ID | query_neutron_notifications_in_kibana_ui |
Description | Verify that the Neutron notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Run OSTF functional tests: ‘Create security group’ and ‘Check network connectivity from instance via floating IP’.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘neutron’ in the Query box.
Expected Result¶
All event types for Neutron are listed.
Display and query Keystone notifications in the Kibana UI¶
Test Case ID | query_keystone_notifications_in_kibana_ui |
Description | Verify that the Keystone notifications show up in the Kibana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Run OSTF platform test: ‘Create user and authenticate with it to Horizon’.
- Open the Kibana URL at
http://<IP address of the 'lma' node>/
- Open the Notifications dashboard using the ‘Load’ icon.
- Enter ‘keystone’ in the Query box.
Expected Result¶
All event types for Keystone are listed.
Display the dashboards in the Grafana UI¶
Test Case ID | display_dashboards_in_grafana_ui |
Description | Verify that the dashboards show up in the Grafana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Open the Grafana URL at
http://<IP address of the 'lma' node>:8000/
Sign in using the credentials provided during the configuration of the environment.
Go to the Main dashboard and verify that all panels display data.
Repeat the previous step for the following dashboards:
- Cinder
- Glance
- Heat
- Keystone
- Nova
- Neutron
- HAProxy
- RabbitMQ
- MySQL
- Apache
- Memcached
- System
- LMA Self-monitoring
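The presence of the dashboards listed above can also be checked through the Grafana HTTP API; a sketch assuming a Grafana version that exposes /api/search (2.x and later) and the ‘lma’/‘lmapass’ credentials from the deployment test case:

```shell
# List the dashboards known to Grafana; the output should include the
# dashboards enumerated in the steps above.
list_grafana_dashboards() {
    LMA_IP=$1
    curl -s -u lma:lmapass "http://${LMA_IP}:8000/api/search"
}
```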
Expected Result¶
The Grafana UI shows the overall status of the OpenStack services and detailed statistics about the selected controller.
Display the Nova metrics in the Grafana UI¶
Test Case ID | display_nova_metrics_in_grafana_ui |
Description | Verify that the Nova metrics show up in the Grafana UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
- Open the Grafana URL at
http://<IP address of the 'lma' node>:8000/
- Sign in using the credentials provided during the configuration of the environment.
- Go to the Nova dashboard.
- Connect to the Fuel web UI, launch the full suite of OSTF tests and wait for their completion.
- Check that the ‘instance creation time’ graph in the Nova dashboard reports values.
Expected Result¶
The Grafana UI shows the instance creation time over time.
Report service alerts with warning severity¶
Test Case ID | report_service_alerts_with_warning_severity |
Description | Verify that the warning alerts for services show up in the Grafana and Nagios UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Open the Grafana URL at http://<IP address of the 'lma' node>:8000/ and load the Nova dashboard.
Open the Nagios URL at http://<IP address of the 'lma' node>:8001/ in another tab and click the ‘Services’ menu item.
Connect to one of the controller nodes using ssh and stop the nova-api service.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘WARN’ with an orange background,
- the API panels report 1 entity as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘WARNING’ state,
- the local user root on the lma node has received an email about the service being in warning state.
Restart the nova-api service.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- the API panels report 0 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘OK’ state,
- the local user root on the lma node has received an email about the recovery of the service.
Stop the nova-scheduler service.
Wait for at least 3 minutes.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘WARN’ with an orange background,
- the scheduler panel reports 1 entity as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘WARNING’ state,
- the local user root on the lma node has received an email about the service being in warning state.
Restart the nova-scheduler service.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- the scheduler panel reports 0 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘OK’ state,
- the local user root on the lma node has received an email about the recovery of the service.
Repeat steps 2 to 18 for the following services:
- Cinder (stopping and starting the cinder-api and cinder-scheduler services respectively).
- Neutron (stopping and starting the neutron-server and neutron-openvswitch-agent services respectively).
Repeat steps 2 to 10 for the following services:
- Glance (stopping and starting the glance-api service).
- Heat (stopping and starting the heat-api service).
- Keystone (stopping and starting the Apache service).
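Each stop/wait/restart iteration above follows the same pattern; a minimal sketch, assuming root ssh access and sysvinit-style service management on the controllers:

```shell
# Stop a service on one controller, wait for the alerting pipeline to
# react, then restart it. The controller address is a placeholder.
toggle_service() {
    controller=$1 service=$2 wait_secs=${3:-60}
    ssh "root@${controller}" "service ${service} stop"
    sleep "${wait_secs}"   # 60s for API services, 180s for schedulers
    ssh "root@${controller}" "service ${service} start"
}
```

For example: toggle_service 10.109.1.5 nova-scheduler 180 (the address is an example).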
Expected Result¶
The Grafana UI shows that the global service status goes from ok to warning and back to ok. It also reports detailed information about which entity is missing.
The Nagios UI shows that the service status goes from ok to warning and back to ok. Alerts are sent by email to the configured recipient.
Report service alerts with critical severity¶
Test Case ID | report_service_alerts_with_critical_severity |
Description | Verify that the critical alerts for services show up in the Grafana and Nagios UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Open the Grafana URL at http://<IP address of the 'lma' node>:8000/ and load the Nova dashboard.
Open the Nagios URL at http://<IP address of the 'lma' node>:8001/ in another tab and click the ‘Services’ menu item.
Connect to one of the controller nodes using ssh and stop the nova-api service.
Connect to a second controller node using ssh and stop the nova-api service.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘CRIT’ with a red background,
- the API panels report 2 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘CRITICAL’ state,
- the local user root on the lma node has received an email about the service being in critical state.
Restart the nova-api service on both nodes.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- the API panels report 0 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘OK’ state,
- the local user root on the lma node has received an email about the recovery of the service.
Connect to one of the controller nodes using ssh and stop the nova-scheduler service.
Connect to a second controller node using ssh and stop the nova-scheduler service.
Wait for at least 3 minutes.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘CRIT’ with a red background,
- the scheduler panel reports 2 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘CRITICAL’ state,
- the local user root on the lma node has received an email about the service being in critical state.
Restart the nova-scheduler service on both nodes.
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- the scheduler panel reports 0 entities as down.
On Nagios, check the following items:
- the ‘nova’ service is in ‘OK’ state,
- the local user root on the lma node has received an email about the recovery of the service.
Repeat steps 2 to 21 for the following services:
- Cinder (stopping and starting the cinder-api and cinder-scheduler services respectively).
- Neutron (stopping and starting the neutron-server and neutron-openvswitch-agent services respectively).
Repeat steps 2 to 11 for the following services:
- Glance (stopping and starting the glance-api service).
- Heat (stopping and starting the heat-api service).
- Keystone (stopping and starting the Apache service).
Expected Result¶
The Grafana UI shows that the global service status goes from ok to critical and back to ok. It also reports detailed information about which entity is missing.
The Nagios UI shows that the service status goes from ok to critical and back to ok. Alerts are sent by email to the configured recipient.
Report node alerts with warning severity¶
Test Case ID | report_node_alerts_with_warning_severity |
Description | Verify that the warning alerts for nodes show up in the Grafana and Nagios UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Open the Grafana URL at http://<IP address of the 'lma' node>:8000/ and load the MySQL dashboard.
Open the Nagios URL at http://<IP address of the 'lma' node>:8001/ in another tab and click the ‘Services’ menu item.
Connect to one of the controller nodes using ssh and run:
fallocate -l $(df | grep /dev/mapper/mysql-root | awk '{ printf("%.0f\n", 1024 * ((($3 + $4) * 96 / 100) - $3))}') /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘OK’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘WARNING’ state for the node.
Connect to a second controller node using ssh and run:
fallocate -l $(df | grep /dev/mapper/mysql-root | awk '{ printf("%.0f\n", 1024 * ((($3 + $4) * 96 / 100) - $3))}') /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘WARN’ with an orange background,
- an annotation telling that the service went from ‘OKAY’ to ‘WARN’ is displayed.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘WARNING’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘WARNING’ state for the 2 nodes,
- the local user root on the lma node has received an email about the service being in warning state.
Run the following command on both controller nodes:
rm /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- an annotation telling that the service went from ‘WARN’ to ‘OKAY’ is displayed.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘OK’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘OK’ state for the 2 nodes,
- the local user root on the lma node has received an email about the recovery of the service.
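The size argument passed to fallocate in the steps above can be checked in isolation: from df's used ($3) and available ($4) columns (1K blocks), it computes the number of bytes needed to bring filesystem usage up to a target percentage (96% here, 98% in the critical-severity test case). Extracted as a function:

```shell
# Compute the fallocate size in bytes:
# 1024 * (((used + available) * target% / 100) - used)
df_fill_size() {
    used_kb=$1; avail_kb=$2; target_pct=$3
    awk -v u="$used_kb" -v a="$avail_kb" -v p="$target_pct" \
        'BEGIN { printf("%.0f\n", 1024 * (((u + a) * p / 100) - u)) }'
}
```

For example, with 3000000 KB used and 7000000 KB available, df_fill_size 3000000 7000000 96 yields 6758400000 bytes.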
Expected Result¶
The Grafana UI shows that the global ‘mysql’ status goes from ok to warning and back to ok. It also reports detailed information about the problem in the annotations.
The Nagios UI shows that the service status goes from ok to warning and back to ok. Alerts are sent by email to the configured recipient.
Report node alerts with critical severity¶
Test Case ID | report_node_alerts_with_critical_severity |
Description | Verify that the critical alerts for nodes show up in the Grafana and Nagios UI. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Open the Grafana URL at http://<IP address of the 'lma' node>:8000/ and load the MySQL dashboard.
Open the Nagios URL at http://<IP address of the 'lma' node>:8001/ in another tab and click the ‘Services’ menu item.
Connect to one of the controller nodes using ssh and run:
fallocate -l $(df | grep /dev/mapper/mysql-root | awk '{ printf("%.0f\n", 1024 * ((($3 + $4) * 98 / 100) - $3))}') /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘OK’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘CRITICAL’ state for the node.
Connect to a second controller node using ssh and run:
fallocate -l $(df | grep /dev/mapper/mysql-root | awk '{ printf("%.0f\n", 1024 * ((($3 + $4) * 98 / 100) - $3))}') /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘CRIT’ with a red background,
- an annotation telling that the service went from ‘OKAY’ to ‘CRIT’ is displayed.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘CRITICAL’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘CRITICAL’ state for the 2 nodes,
- the local user root on the lma node has received an email about the service being in critical state.
Run the following command on both controller nodes:
rm /var/lib/mysql/test
Wait for at least 1 minute.
On Grafana, check the following items:
- the box in the upper left corner of the dashboard displays ‘OKAY’ with a green background,
- an annotation telling that the service went from ‘CRIT’ to ‘OKAY’ is displayed.
On Nagios, check the following items:
- the ‘mysql’ service is in ‘OK’ state,
- the ‘mysql-nodes.mysql-fs’ service is in ‘OK’ state for the 2 nodes,
- the local user root on the lma node has received an email about the recovery of the service.
Expected Result¶
The Grafana UI shows that the global ‘mysql’ status goes from ok to critical and back to ok. It also reports detailed information about the problem in the annotations.
The Nagios UI shows that the service status goes from ok to critical and back to ok. Alerts are sent by email to the configured recipient.
Non-functional testing¶
Simulate network failure on the analytics node¶
Test Case ID | network_failure_on_analytics_node |
Description | Verify that the backends and dashboards recover after a network failure. |
Prerequisites | Environment deployed with the 4 plugins (see Deploy an environment with the plugins). |
Steps¶
Copy this script to the analytics node:
#!/bin/sh
/sbin/iptables -I INPUT -j DROP
sleep 30
/sbin/iptables -D INPUT -j DROP
Log in to the analytics node using ssh.
Run the script and wait for it to complete.
Check that the Kibana, Grafana and Nagios dashboards are available.
Check that data continues to be pushed by the various nodes once the network failure has ended.
Expected Result¶
The collectors recover from the network outage of the analytics node.