Live Environment Ecosystem at CTERA

By Tal Moshe
April 20, 2023

Trust is at the heart of everything we do at CTERA. We understand that our customers rely on our products to safeguard their critical data and enable their business operations. That’s why we are committed to delivering storage solutions that you can trust to operate reliably, with consistent performance, and at scale. In this article, I’d like to introduce you to our live environment, one of the tools that we have developed internally to ensure our products meet the highest standards of quality, scale, and reliability.

In Quality Assurance (QA), a live environment refers to a production-like environment in which a software application or system operates in real time with real data and real users. The live environment is the final stage of software development and testing, where the application or system is deployed for end-users to use in their daily operations.

The essence of the live environment in QA is to ensure that the application or system meets the expectations of end-users in terms of performance, functionality, usability, security, and reliability. It is important to thoroughly test and validate the application or system in the live environment before it is released to the customers to avoid any issues or errors that could negatively impact end-users.

In the live environment, QA teams can perform various tests such as acceptance testing, integration testing, regression testing, and performance testing to ensure that the application or system meets the desired quality standards. Any issues or defects found during these tests are usually addressed and resolved before the application or system is made available to end-users.

The live environment is also important for ongoing maintenance and support of the application or system, as it provides real-world scenarios, use cases, and feedback that can be used to identify and resolve any issues that arise post-release. This helps ensure that the application or system continues to meet the needs and expectations of end-users over time.

The live environment is unique in that it is the only environment where the application or system is being tested at scale and for a long period of time, with real generated data and in real-world scenarios. This means that the live environment can uncover issues that may not have been detected during earlier stages of testing, such as issues related to scalability, performance, or user behavior.

At CTERA, we have a live environment ecosystem configured as a typical CTERA customer environment. The system is divided into three parts:

  1. The system under test
  2. A set of tools that simulate real user activities on the system
  3. A set of automated scripts on the system to verify its stability

The activity carried out on the system is completely random. Tens of thousands of user operations, such as reading, writing, editing, copying, and deleting files and folders, are performed on the system every day, all randomized in real time. They are performed by a unique set of tools developed by our Engineering organization, based on typical user activity patterns over network file-sharing protocols (SMB/NFS). These patterns were derived from analyses of real customer use cases. The tools also generate a complementary log of operations, which our test engineers track to identify issues immediately and to get a better view of the activities and their effects on the system.
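To make the idea concrete, here is a minimal sketch of a randomized workload generator that keeps a log of operations. It is not CTERA's internal tooling: the operation weights, the `LoggedOperation` record, and the `run_random_workload` function are all hypothetical, and a plain local directory stands in for a mounted SMB/NFS share. For brevity it works on files only, not folders.

```python
import os
import random
import shutil
from dataclasses import dataclass

# Hypothetical weights; the real activity patterns are not published.
OPERATION_WEIGHTS = {"write": 4, "read": 3, "edit": 2, "copy": 1, "delete": 1}


@dataclass
class LoggedOperation:
    sequence: int
    operation: str
    path: str
    succeeded: bool


def run_random_workload(root, count, seed=None):
    """Perform `count` randomly chosen file operations under `root`, logging each."""
    rng = random.Random(seed)
    names = list(OPERATION_WEIGHTS)
    weights = list(OPERATION_WEIGHTS.values())
    log = []
    for sequence in range(count):
        existing = sorted(os.listdir(root))
        operation = rng.choices(names, weights=weights)[0]
        if not existing:
            operation = "write"  # nothing to read, edit, copy, or delete yet
        if operation == "write":
            path = os.path.join(root, f"file_{sequence}.txt")
        else:
            path = os.path.join(root, rng.choice(existing))
        try:
            if operation == "write":
                with open(path, "w") as handle:
                    handle.write(f"created by operation {sequence}\n")
            elif operation == "read":
                with open(path) as handle:
                    handle.read()
            elif operation == "edit":
                with open(path, "a") as handle:
                    handle.write(f"edited by operation {sequence}\n")
            elif operation == "copy":
                shutil.copy(path, os.path.join(root, f"copy_{sequence}.txt"))
            else:  # delete
                os.remove(path)
            succeeded = True
        except OSError:
            succeeded = False  # recorded in the log for later investigation
        log.append(LoggedOperation(sequence, operation, path, succeeded))
    return log
```

Calling `run_random_workload(some_directory, count=100, seed=1)` returns one log entry per operation. Passing a seed makes a run reproducible, which can help when replaying the sequence that preceded a failure.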

Furthermore, dozens of automated verifications run constantly on the system to ensure that it is functioning at its best. When a verification detects an issue, an email with the relevant data is sent to the engineer for further investigation.
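The verify-and-notify loop can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual implementation: `check_free_space`, `run_verifications`, and the sender address are hypothetical, and the sketch only builds the alert emails, leaving delivery (for example through `smtplib`) to the caller.

```python
from email.message import EmailMessage


def check_free_space(free_bytes, minimum_bytes):
    """Hypothetical check: fail when free storage drops below a threshold."""
    ok = free_bytes >= minimum_bytes
    return ok, f"free={free_bytes} minimum={minimum_bytes}"


def run_verifications(checks, recipient):
    """Run each named check and build one alert email per failed check."""
    alerts = []
    for name, check in checks.items():
        try:
            ok, details = check()
        except Exception as error:  # a crashing check is treated as a failure
            ok, details = False, f"check raised {error!r}"
        if not ok:
            message = EmailMessage()
            message["To"] = recipient
            message["From"] = "live-env@example.com"  # placeholder sender
            message["Subject"] = f"Live environment verification failed: {name}"
            message.set_content(details)
            alerts.append(message)
    return alerts
```

Each check is a zero-argument callable that returns a pass/fail flag plus details, so new verifications can be added without changing the loop. Only failed checks produce a message.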

As mentioned, the live environment is monitored constantly with automatic tools and manual verifications, so it can be treated as a real high-profile customer environment with the most aggressive SLAs. Dozens of issues are found in the live environment each year. When an incident occurs, the test engineer works closely with the relevant software team to analyze, report, and solve the issue, as it may critically affect system stability in the field.

The activities and verifications on our live environment are updated whenever a new feature is introduced or a root cause analysis (RCA) of a field incident calls for it. We keep improving the live environment by adding new real-world scenarios and use cases. As more activities and verifications are added, the QA engineer is still notified only when an incident affects the system.

A daily report on the environment is generated to confirm that every activity and verification is running properly. The report also summarizes the activity on the system over the last 24 hours. In addition, a dashboard updated in real time displays the environment’s configuration, status, and metrics such as memory consumption, connection status, storage consumption, load, and more.
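As a rough illustration of the 24-hour summary, the sketch below counts operations and failures in a log. The entry layout (dictionaries with `time`, `operation`, and `succeeded` keys) and the `summarize_last_day` function are assumptions made for this example, not the format of our actual report.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize_last_day(entries, now):
    """Count operations and failures among entries from the 24 hours before `now`."""
    cutoff = now - timedelta(hours=24)
    recent = [entry for entry in entries if cutoff <= entry["time"] <= now]
    return {
        "total": len(recent),
        "by_operation": dict(Counter(entry["operation"] for entry in recent)),
        "failures": sum(1 for entry in recent if not entry["succeeded"]),
    }


# Usage with made-up entries: one is older than 24 hours and is left out.
now = datetime(2023, 4, 20, 12, 0)
entries = [
    {"time": now - timedelta(hours=30), "operation": "write", "succeeded": True},
    {"time": now - timedelta(hours=5), "operation": "read", "succeeded": True},
    {"time": now - timedelta(hours=1), "operation": "read", "succeeded": False},
]
summary = summarize_last_day(entries, now)
```

A report that shows zero operations of some type over the window is itself a signal that one of the activity tools may have stopped running.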

In summary, the essence of the live environment ecosystem at CTERA is to ensure that the system operates as intended in the production environment, with real operations on real generated data, based on real use cases and under real-world conditions. It is a critical part of our overall software development lifecycle, as it provides valuable feedback and insights that help improve the overall quality of the system.

The importance of quality at CTERA cannot be overstated. As a leading hybrid cloud storage company trusted by some of the largest enterprises worldwide to handle their critical data, CTERA must maintain the highest standards of quality so that our customers can rely on our system’s performance and security. By leveraging a rigorous live environment ecosystem, we can continuously improve and enhance our system to meet the evolving needs of our customers, and ultimately deliver the level of quality and reliability they expect from a trusted technology partner.

About the Author

Tal enjoys leading QA and DEV teams and implementing pragmatic strategies to meet the highest quality standards. Prior to CTERA, Tal held QA management roles at McAfee, Sentrigo, and SofaWare.