
    AI-Augmented QA: Building Continuous Quality into Modern Engineering


    In a high-scale streaming environment, a bad release isn’t just a bug. It’s a live failure seen by millions of people in real time.

    Most engineering teams still treat quality as something that sits alongside development rather than within it. But as we move toward AI-driven pipelines and continuous deployments, that gap is becoming impossible to ignore. The limitations of end-stage quality checks are becoming increasingly clear.

    As that gap widens, the old way of doing QA begins to break. Today’s systems are probabilistic, generating varied outputs that require a shift from binary ‘pass/fail’ thinking to confidence thresholds and behavioral consistency.

    What this shift looks like in practice is less about theory and more about how quality engineering teams operate. It’s about moving toward a model where quality is a continuous signal—driven by agentic testing, AI-assisted automation, and real-time observability.

    In this conversation, Gregory Goldshteyn breaks down how that change is taking shape, drawing on the high-velocity world of streaming to explore how to build a resilient, quality-first architecture for the AI era.

    Meet the Expert

    Gregory Goldshteyn

    QA Leader, AI in Testing, DevOps Strategist

    *Visionary QA Leader with substantial experience in the IT industry*

    Throughout my career—from Salesforce and Sony to my current role at FOX—I have been obsessed with one question:

    How do we deliver flawless digital experiences at the speed of modern business?


    My years of experience in quality test engineering have equipped me to focus on finding issues and fostering a quality-centric mindset among teams. Included in the Bristol Who’s Who exclusive international registry, which honors and recognizes top professionals for excellence within their industries.

    Q1: With AI-driven and probabilistic systems becoming mainstream, how should QA evolve beyond traditional validation approaches?

    Traditional QA was built around deterministic systems: you give it input A, you expect output B, every time. AI flips that contract. When I work with probabilistic systems, I shift from pass/fail binary thinking to confidence thresholds, output distributions, and behavioral consistency over time. At Fox, as we’ve integrated AI-driven features into our streaming platforms for personalization, recommendations, and content moderation, QA had to evolve alongside them. We moved toward property-based testing, golden dataset validation, and statistical sampling rather than exhaustive assertions. We also introduced bias and fairness checks as first-class test criteria. The mindset shift is big: instead of asking “did it return the right answer?”, we ask “is it returning good answers at an acceptable rate, and failing gracefully when it doesn’t?”
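
    To make the idea of a confidence threshold concrete, here is a minimal sketch of a statistical release check over a sampled golden dataset. It is an illustration under stated assumptions, not a description of Fox's tooling: the function names, the 0.85 threshold, and the choice of a Wilson score lower bound are all hypothetical choices made for the example.

```python
import math


def wilson_lower_bound(passes: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a pass rate.

    z = 1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if total == 0:
        return 0.0
    p = passes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def passes_quality_gate(results: list[bool], threshold: float = 0.85) -> bool:
    """Pass only if we are confident the true pass rate exceeds the threshold.

    `results` holds one boolean per sampled golden-dataset case
    (True = the model output was judged acceptable).
    """
    return wilson_lower_bound(sum(results), len(results)) >= threshold


if __name__ == "__main__":
    sample = [True] * 95 + [False] * 5  # 95 acceptable outputs out of 100
    print(round(wilson_lower_bound(95, 100), 3))
    print(passes_quality_gate(sample))
```

    Gating on the lower bound rather than the raw pass rate means a small sample has to perform better than a large one to earn the same level of confidence.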

    Q2: You’ve spoken about agentic and GenAI-led testing—where do you see these making the biggest impact today?

    The biggest immediate impact I’ve seen is in test generation and maintenance, areas that have historically burned enormous QA cycles. GenAI can analyze a user story, generate a meaningful test suite, and flag edge cases a human might overlook, all in minutes. Agentic testing goes further: autonomous agents that can navigate a UI, discover broken flows, and self-heal test scripts when the UI changes. At a company like Fox with complex, high-traffic streaming infrastructure, that means less time maintaining brittle automation and more time doing strategic quality work. The second big impact is in log analysis and anomaly detection: using LLMs to interpret thousands of log lines and surface the ones that actually matter during an incident.

    Q3: In high-scale, high-velocity environments like streaming, what are the risks of treating QA as a final checkpoint, and what does a resilient quality architecture look like instead?

    This is something I feel strongly about, having worked in live streaming environments where a bad release can affect millions of viewers instantly. When QA is bolted on at the end, issues are caught late, when they cost far more to fix.


    The risks are significant: silent failures slipping through due to limited coverage, performance regressions that only appear under real load, and a false sense of confidence where QA sign-off is rushed. The deeper issue is cultural; teams start to see quality as “QA’s job” rather than a shared responsibility.


    A resilient quality architecture flips this model. In high-scale streaming environments, quality has to be distributed across every stage of delivery, not concentrated at the end. At Fox, where a flawed release can reach millions within minutes, quality is built into the pipeline.


    Practically, that means three things: shifting checks left into CI/CD so issues are caught early; using intelligent test prioritization to ensure the right coverage at speed rather than slow, exhaustive testing; and treating production observability as a live quality signal, not just an operational concern. Real user behavior will always surface issues that pre-production environments can’t.


    Ultimately, confidence in a release comes from continuous signals across the entire pipeline.

    Q4: How can teams reduce regression cycles while still maintaining confidence in releases?

    The answer lies in intelligent test prioritization combined with risk-based coverage. You don’t need to run everything every time—you need to run the right things. I’ve built strategies around mapping code changes to affected test domains, so a change in the authentication service doesn’t trigger a full regression of the content catalog.
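
    One way to picture mapping code changes to affected test domains is a simple path-prefix lookup. The sketch below is an assumption-laden illustration: the directory names, suite names, and the fall-back-to-full-regression rule are hypothetical and not taken from the interview.

```python
# Hypothetical mapping from source path prefixes to the test suites they affect.
DOMAIN_MAP: dict[str, set[str]] = {
    "services/auth/": {"tests/auth"},
    "services/catalog/": {"tests/catalog"},
    "shared/": {"tests/auth", "tests/catalog"},
}

# Hypothetical full regression set, used when a change cannot be mapped.
FULL_REGRESSION: set[str] = {"tests/auth", "tests/catalog", "tests/playback"}


def select_test_suites(changed_files: list[str]) -> set[str]:
    """Return the test suites to run for a list of changed file paths."""
    selected: set[str] = set()
    for path in changed_files:
        matches = [
            suites for prefix, suites in DOMAIN_MAP.items() if path.startswith(prefix)
        ]
        if not matches:
            # Unmapped change: be conservative and run everything.
            return set(FULL_REGRESSION)
        for suites in matches:
            selected |= suites
    return selected


if __name__ == "__main__":
    print(sorted(select_test_suites(["services/auth/login.py"])))
```

    Falling back to the full suite for unmapped files keeps the shortcut safe: the mapping can only reduce the run when the change is understood.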


    On top of that, using historical failure data to prioritize tests that are most likely to spot real bugs can dramatically reduce cycle time without sacrificing meaningful coverage. Parallelization in CI pipelines is table stakes at this point.
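
    Prioritizing by historical failure data can be sketched as ranking tests by a smoothed failure rate and filling a time budget greedily. The `RunHistory` record, the smoothing formula, and the budget rule are assumptions chosen for the example, not the approach described in the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunHistory:
    """Hypothetical per-test history record."""
    name: str
    runs: int
    failures: int
    duration_s: float


def failure_rate(h: RunHistory) -> float:
    """Laplace-smoothed failure rate, so tests with few runs are not ranked at 0 or 1."""
    return (h.failures + 1) / (h.runs + 2)


def prioritize(history: list[RunHistory], budget_s: float) -> list[str]:
    """Pick tests in descending failure-rate order until the time budget is used."""
    chosen: list[str] = []
    used = 0.0
    for h in sorted(history, key=failure_rate, reverse=True):
        if used + h.duration_s <= budget_s:
            chosen.append(h.name)
            used += h.duration_s
    return chosen


if __name__ == "__main__":
    history = [
        RunHistory("test_login", runs=100, failures=1, duration_s=10),
        RunHistory("test_playback", runs=100, failures=30, duration_s=20),
        RunHistory("test_search", runs=10, failures=5, duration_s=15),
    ]
    print(prioritize(history, budget_s=40))
```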


    Confidence comes from transparency: clear dashboards, well-defined quality gates, and a shared understanding of what “coverage” actually means for a given release.

    Q5: What does a “quality-first” culture look like in modern DevOps and AI-driven pipelines?

    A quality-first culture isn’t about having a dedicated QA team that gates every release—it’s about making quality a shared engineering value, baked into every stage of the pipeline. In modern DevOps and AI-driven environments, that means a few things look fundamentally different.


    First, quality ownership is distributed: developers write and maintain tests as part of their definition of done, and no one ships something “to QA” the way they used to. Second, the pipeline enforces quality automatically—static analysis, security scans, contract tests, and model evaluation checks all run without anyone needing to remember to trigger them. Third, observability is treated as a quality signal, not just an ops concern.


    When running AI-driven pipelines, production monitoring and real-time feedback loops are how you know the system is behaving correctly, because pre-production coverage alone can never catch everything. At Fox, we’ve invested heavily in making quality metrics visible and actionable—tracking flakiness rates, coverage regressions, and model drift in the same dashboards where engineers track deployment health.
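
    As an illustration of one such metric, a flakiness rate can be computed from raw CI results. The definition used here (a test is flaky if it both passed and failed on the same commit) and the tuple layout of the input are assumptions for the sketch; teams define flakiness in different ways.

```python
from collections import defaultdict
from typing import Iterable


def flakiness_rate(runs: Iterable[tuple[str, str, bool]]) -> float:
    """Fraction of tests that both passed and failed on the same commit.

    Each run is (test_name, commit_sha, passed).
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for test, sha, passed in runs:
        outcomes[(test, sha)].add(passed)

    tests = {test for test, _ in outcomes}
    # A test is counted as flaky if any single commit saw both outcomes.
    flaky = {test for (test, _), seen in outcomes.items() if len(seen) == 2}
    return len(flaky) / len(tests) if tests else 0.0


if __name__ == "__main__":
    runs = [
        ("test_login", "abc123", True),
        ("test_login", "abc123", False),  # same commit, different outcome
        ("test_search", "abc123", True),
        ("test_search", "abc123", True),
    ]
    print(flakiness_rate(runs))
```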


    Culture follows visibility. When teams can see the quality state of the system at all times, they start caring about it more.

    Q6: If you had to prioritize one shift in QA strategy for 2026, what would it be?

    The shift I’d prioritize is moving from reactive quality to proactive quality engineering: specifically, treating quality as a continuous signal rather than a phase or a gate. Most organizations still operate with a mental model where quality is something you “check” before release. That model doesn’t hold up when you’re shipping multiple times a day, running AI features that don’t have deterministic outputs, and operating at the scale of a major streaming platform.


    The organizations that are winning on quality in 2026 are the ones that have built systems where quality data flows continuously (from production observability, from automated evaluation pipelines, from real-user feedback loops) and where that data directly drives engineering decisions. Concretely, this means investing in AI-assisted test generation, evaluation frameworks for model behavior, and observability-as-quality infrastructure.


    At Fox, one of the most impactful things we’ve done is close the loop between production incidents and test coverage: when something breaks in production, we automatically generate regression coverage for that scenario, so the failure surface shrinks over time. That kind of continuous improvement flywheel is what I’d encourage every QA team to build toward in 2026.

    Conclusion

    If your business recognizes quality as a continuous signal rather than a final hurdle, you’re already ahead of the curve.


    Thanks to Gregory Goldshteyn for sharing his perspective on how software testing is evolving. The distinction he makes is vital: it separates organizations that are just automating old, slow processes from those that are building resilient, self-healing infrastructure.


    The reality is that in an AI-driven world, you can’t test your way to perfection at the end of a cycle. If the feedback loops between production and development aren’t automated and instant, the gap between a release and a failure will only continue to shrink.


    For most teams, the next step is to look at their ownership model and make quality assurance everyone’s job from the first line of code.
