In autumn 2025, MODUS X launched an internal GitHub Copilot adoption program. The goal was straightforward: to equip our developers with a tool that could accelerate routine tasks without compromising code quality. After 86 days of measurement and analysis, we obtained concrete data — and equally concrete conclusions about where the AI assistant truly delivers value and where it remains a useful tool with certain limitations.
We build software for the energy sector, government services, and large enterprise clients, an environment where mistakes are costly and the speed of delivering new functionality is critical. Our engineering team had been monitoring AI-powered developer tools for quite some time, but we were only prepared to move from interest to systematic adoption once we could objectively measure the return on investment (ROI).
We needed data-driven answers to three key questions rather than relying on marketing promises:
We designed the implementation as a phased project together with our partner, Microsoft. From an architectural perspective, the integration covered the two platforms where our codebase resides: Azure DevOps and GitHub Cloud.
Phase 1. Onboarding. Connecting the team to GitHub Enterprise, conducting GitHub Copilot Fundamentals workshops, and configuring security and access policies.
Phase 2. Measurement and ROI assessment. This was the most important phase for us. Together with our partner, we built a metrics framework based on the GitHub API to track active and engaged users, the volume of accepted suggestions, the share of time saved on routine development tasks, and the estimated financial impact. Rather than relying on developers’ subjective impressions, we created a dashboard that provides a clear view of actual team behavior and usage patterns day by day.
Phases 3 and 4 are now being driven by our internal team and focus on expanding adoption, integrating Copilot into standard workflows, and customizing its use for project-specific requirements.
The key KPIs we monitor throughout these phases are Developer Satisfaction, Code Acceptance Rate (the percentage of AI-generated suggestions that developers accept), Time to Completion, and Code Quality.
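To make the usage metrics from Phase 2 and the Code Acceptance Rate KPI more concrete, here is a minimal Python sketch of how daily usage records could be rolled up into dashboard figures. It is an illustration, not our production framework: the DailyUsage record, its field names, the summarize function, and the definitions of "active" and "engaged" in the comments are assumptions made for this example and do not reproduce the GitHub API schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DailyUsage:
    """One day of usage data. All field names are hypothetical."""
    active_users: int          # assumed: users who received suggestions that day
    engaged_users: int         # assumed: users who accepted at least one suggestion
    suggestions_shown: int
    suggestions_accepted: int


def summarize(days: list[DailyUsage], licensed_seats: int) -> dict[str, float]:
    """Roll daily records up into three percentages for a dashboard."""
    if not days:
        raise ValueError("need at least one day of data")
    if licensed_seats <= 0:
        raise ValueError("licensed_seats must be positive")

    avg_active = mean([d.active_users for d in days])
    avg_engaged = mean([d.engaged_users for d in days])
    shown = sum(d.suggestions_shown for d in days)
    accepted = sum(d.suggestions_accepted for d in days)

    return {
        # Share of licensed developers who are active on an average day.
        "active_share_pct": 100 * avg_active / licensed_seats,
        # Share of active developers who actually accept suggestions.
        "engagement_pct": 100 * avg_engaged / avg_active if avg_active else 0.0,
        # Code Acceptance Rate: accepted suggestions over suggestions shown.
        "acceptance_rate_pct": 100 * accepted / shown if shown else 0.0,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    sample = [DailyUsage(20, 18, 400, 100), DailyUsage(24, 15, 600, 150)]
    print(summarize(sample, licensed_seats=44))
```

Averaging users per day while summing suggestions over the whole window keeps a single busy day from distorting the acceptance rate.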
Our Phase 4 targets are:
We started with 44 developer licenses across nine projects. During the 86-day measurement period, the dashboard tracked everything from the number of generated suggestions to the hours redirected toward productive coding work.
Key metrics
The final figure deserves additional context. It does not represent cash savings sitting in a bank account. Rather, it reflects the value of the time returned to the team — time that can now be invested in new initiatives and higher-value work instead of routine development tasks.
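One common way to express "value of time returned" is to multiply the hours saved by a loaded hourly cost and then scale the measurement window to a full year. The Python sketch below shows that arithmetic under stated assumptions: the function names and the sample inputs are hypothetical, and the article does not state that the dashboard computes its estimate exactly this way.

```python
def value_of_time_returned(hours_saved: float, loaded_hourly_cost: float) -> float:
    """Monetary value of time handed back to the team (not a cash saving)."""
    if hours_saved < 0 or loaded_hourly_cost < 0:
        raise ValueError("inputs must be non-negative")
    return hours_saved * loaded_hourly_cost


def annualize(value_in_window: float, window_days: int, days_per_year: int = 365) -> float:
    """Scale a value observed over window_days to a full year.

    ASSUMPTION: usage during the window is representative of the whole year.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return value_in_window * days_per_year / window_days


if __name__ == "__main__":
    # Hypothetical inputs for illustration; only the 86-day window
    # comes from the article.
    window_value = value_of_time_returned(hours_saved=100, loaded_hourly_cost=50)
    print(annualize(window_value, window_days=86))
```

Because the result is a projection of time value, it is sensitive to both the hourly cost chosen and how typical the measurement window was.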
The most significant and obvious impact we observed was in unit testing.
One of our backend developers used Copilot throughout the entire process of writing tests for an internal electronic signature microservice. His assessment: the process became 40–50% faster, primarily because Copilot generated the initial test scaffolding and standard scenarios. Copilot creates test structures and basic logic and catches minor typos, while the developer remains responsible for correctness, edge cases, and overall quality.
“Today, we use AI across virtually every stage of the SDLC, not just in isolated cases — from requirements validation and task decomposition to feature implementation, test generation, code reviews, documentation, and quality analysis. For us, this marks a shift from simply accelerating routine operations to adopting an AI-first engineering approach, where AI is no longer an external assistant but an integral part of the day-to-day engineering process.
However, it is important to emphasize that value comes not from the tool itself, but from the maturity of the team and its ability to use it effectively. AI significantly amplifies engineers who understand the strengths and limitations of modern AI engineering tools, know how to improve agent skills for project-specific needs, work consciously with prompts, and recognize how even a small change in context can improve or degrade the outcome.
At the same time, critical evaluation of code, architectural decisions, security, and quality remains the responsibility of humans.
To develop this approach systematically, we have established a dedicated AI Engineering SWAT team that evaluates which tools, practices, prompts, and agent skills truly work across different delivery scenarios and which require further refinement. This enables us not only to expand AI adoption but also to implement targeted improvements, embed successful practices into engineering standards, and scale them where they produce measurable results.
That is why we do not view GitHub Copilot and other AI tools as replacements for developers. For us, they are a way to give engineers more time for higher-value work: architecture, complex business logic, product quality, and customer outcomes. The next step is to scale AI engineering practices throughout the delivery process wherever they create the greatest impact.”
Artem Andrusenko, Head of Software Development, MODUS X
Another backend developer testing Copilot on a mobile backend observed the opposite pattern: large, generic requests tend to produce inconsistent results. When asking Copilot to generate tests in bulk for a large class, the resulting code often requires more effort to refine than writing it from scratch. In contrast, smaller and more targeted prompts — such as “write a test for this specific scenario with these input parameters” — consistently deliver strong results.
The same developer identified another important distinction that should be considered at the project level. If the objective is simply to increase test coverage as quickly as possible, Copilot performs exceptionally well and can achieve results within hours. However, if the goal is to create high-quality tests that genuinely protect against regressions, the AI assistant remains helpful but no longer provides the same level of acceleration. Achieving that outcome requires targeted prompts, manual refactoring, and engineering judgment.
This highlights an important point: test coverage percentage and test quality are not the same metric, and Copilot accelerates them to different degrees.
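The difference between coverage and quality is easy to see in a small Python example. The function clamp_percentage and both test classes are hypothetical and written only for this illustration. The first test executes every line of the function, so a coverage tool would report it as fully covered, yet it would keep passing if the bounds were wrong. The second test is the kind of targeted, scenario-specific test described above.

```python
import unittest


def clamp_percentage(value: float) -> float:
    """Hypothetical function under test: limit a value to the range 0..100."""
    return max(0.0, min(100.0, value))


class CoverageOnlyTest(unittest.TestCase):
    def test_runs(self):
        # Raises line coverage, but asserts nothing about the bounds,
        # so a regression in the clamping logic would go unnoticed.
        self.assertIsNotNone(clamp_percentage(50.0))


class RegressionTest(unittest.TestCase):
    def test_boundaries(self):
        # Explicit inputs and expected outputs, including both edges.
        cases = [
            (-5.0, 0.0),
            (0.0, 0.0),
            (42.5, 42.5),
            (100.0, 100.0),
            (130.0, 100.0),
        ]
        for given, expected in cases:
            with self.subTest(given=given):
                self.assertEqual(clamp_percentage(given), expected)


if __name__ == "__main__":
    unittest.main()
```

Both tests contribute the same coverage figure; only the second one fails when the behavior changes.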
This aligns with observations across our broader engineering team: an AI assistant is not a machine that replaces engineering thinking. It amplifies the capabilities of those who can effectively decompose problems and formulate precise requests.
Another area we intend to explore further is Copilot’s performance in a Test-Driven Development (TDD) environment. Our hypothesis is that if tests are written first and developed through small iterations, the AI assistant may perform even better than it does when retrospectively generating tests for existing code. We are also interested in evaluating its effectiveness for integration testing, where the specifics of our architecture play a much greater role.
If we summarize what we observed over these 86 days:
The economics work even with moderate adoption. With only 36.5% of licensed developers actively using the tool, projected annual savings already stand at USD 85,000, and we expect the return to grow as adoption expands.
Usage quality has not declined. Engagement levels of 80–90% indicate that active users have genuinely integrated Copilot into their daily workflows rather than trying it once and abandoning it.
The greatest gains come from routine and repetitive tasks, such as unit tests, boilerplate code, standard scenarios, and documentation.
AI does not replace engineers — it reallocates their time. We see a reduction in non-coding activities and an increase in the share of time invested directly in software development.
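The economics in the first takeaway can be unpacked with simple arithmetic on the figures reported in this article: 44 licenses, 36.5% active usage, and USD 85,000 in projected annual savings. The Python sketch below is a what-if calculation and not a forecast from our measurements. It assumes that savings scale linearly with the share of active developers, which is a simplification chosen for illustration, and the function names are hypothetical.

```python
# Figures reported in the article.
LICENSES = 44
ACTIVE_SHARE = 0.365
PROJECTED_ANNUAL_SAVINGS_USD = 85_000.0


def active_developers(licenses: int, active_share: float) -> float:
    """Average number of active developers implied by the usage share."""
    return licenses * active_share


def savings_per_active_developer(
    annual_savings: float, licenses: int, active_share: float
) -> float:
    """Projected annual savings divided by the implied active headcount."""
    return annual_savings / active_developers(licenses, active_share)


def linear_what_if(
    annual_savings: float, current_share: float, target_share: float
) -> float:
    """ASSUMPTION: savings grow in direct proportion to the active share."""
    if not (0 < current_share <= 1 and 0 < target_share <= 1):
        raise ValueError("shares must be in the range (0, 1]")
    return annual_savings * target_share / current_share


if __name__ == "__main__":
    print(active_developers(LICENSES, ACTIVE_SHARE))
    print(savings_per_active_developer(
        PROJECTED_ANNUAL_SAVINGS_USD, LICENSES, ACTIVE_SHARE))
    # Hypothetical target: twice the current active share.
    print(linear_what_if(PROJECTED_ANNUAL_SAVINGS_USD, ACTIVE_SHARE, 0.73))
```

The implied headcount is not a whole number because the usage share is an aggregate over the measurement period, not a count of named individuals.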
The next phase is to move beyond the current 36.5% adoption rate. Our plan consists of three steps:
Expansion. Engaging licensed users who have not yet fully adopted the tool, using mentoring sessions, knowledge sharing within teams, and success stories from the most active users.
Integration into standards. Embedding AI practices into code review guidelines, onboarding programs for new developers, engineering documentation, and project-specific prompt templates.
Deeper metrics. Connecting Copilot metrics to business outcomes, including release velocity, defect rates, and lead time from commit to production.
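As an illustration of the third step, lead time from commit to production can be computed from pairs of timestamps. The Python sketch below is a minimal example under assumptions: the function names and the sample timestamps are hypothetical, and in practice the data would come from the version control and deployment systems.

```python
from datetime import datetime
from statistics import median


def lead_time_hours(committed_at: datetime, deployed_at: datetime) -> float:
    """Hours between a commit and its deployment to production."""
    seconds = (deployed_at - committed_at).total_seconds()
    if seconds < 0:
        raise ValueError("deployment cannot precede the commit")
    return seconds / 3600


def median_lead_time_hours(changes: list[tuple[datetime, datetime]]) -> float:
    """Median lead time over (committed_at, deployed_at) pairs."""
    if not changes:
        raise ValueError("need at least one change")
    return median([lead_time_hours(c, d) for c, d in changes])


if __name__ == "__main__":
    # Illustrative timestamps only, not measured data.
    sample = [
        (datetime(2025, 10, 1, 9, 0), datetime(2025, 10, 1, 15, 0)),
        (datetime(2025, 10, 2, 9, 0), datetime(2025, 10, 3, 9, 0)),
        (datetime(2025, 10, 3, 9, 0), datetime(2025, 10, 3, 21, 0)),
    ]
    print(median_lead_time_hours(sample))
```

The median is used here instead of the mean so that one unusually slow release does not dominate the figure. Comparing it before and after wider adoption is one possible way to relate Copilot usage to delivery outcomes.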
We see AI assistants for software development not as a one-time experiment, but as a long-term shift in the engineering workflow. At this stage, the quality of integration into mature development processes is more important to us than the speed of adoption itself.
“AI in software development is no longer an experiment — it is becoming part of the engineering operating model. The question is no longer whether organizations should use these tools, but how systematically they can integrate them into their processes and convert productivity gains into real business value.
At the same time, the structure of engineering teams is evolving. The value of professionals who combine deep technical expertise with the ability to effectively leverage AI continues to grow, while team composition is gradually shifting toward smaller, highly productive teams.”
Oleksii Vyhodskyi, Practice Director, Enterprise Applications and Technologies, MODUS X