Title: Building the NVAI Pipeline

Summary:
Canvas, Genesis, and Mastery Connect each live in their own silo. Manuel Escobar Rodriguez, Jake Ellengold, and Daniel Winter built the pipeline that pulls them together into a single source of truth for an LLM. This post covers how the three students unified three independent data systems so a school district's AI can actually answer questions.

Key Ideas:
1. The Pipeline Problem
2. MISO: Keeping Outputs Honest
3. What It Adds Up To

Permalink: https://aiaieducation.org/blog/nvai-pipeline

Full Post Body:

# Building the NVAI Pipeline

<div className="aspect-video rounded-2xl overflow-hidden my-8">
  <iframe
    src="https://www.youtube.com/embed/NeBGrBOdvmU"
    title="NVAI Pipeline Overview"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowFullScreen
    className="w-full h-full"
  />
</div>

When you build an AI service for a school district, you quickly run into a problem that has nothing to do with the model itself: the data is scattered across systems that were never designed to talk to each other.

NVAI pulls from three of them — Canvas, Genesis, and Mastery Connect. Each handles a different slice of a student's academic life, and each is completely independent. Canvas tracks coursework and assignments. Genesis holds attendance and grade records. Mastery Connect measures standards-based proficiency. A teacher asking a question about a student needs all three. The LLM powering NVAI does too.

## The Pipeline Problem

The challenge isn't fetching data — it's that three independent systems return data in three different shapes. Before any of it reaches the model, it has to be extracted from each service separately and then normalized into a single source of truth. That normalization step is the core of the pipeline. Without it, the LLM would be reasoning across inconsistent formats, filling in gaps with assumptions, and producing answers that are plausible but not grounded.

The pipeline ensures that what the model receives is clean, consistent, and complete — so that what the user receives is actually useful.
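To make the normalization step concrete, here is a minimal sketch in Python. It is not the NVAI implementation: the payload shapes, field names (`user_id`, `StudentID`, `scores`, and so on), and the `StudentRecord` type are all hypothetical stand-ins, since the post does not show the real Canvas, Genesis, or Mastery Connect responses. It only illustrates the idea of mapping three differently shaped inputs onto one record.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class StudentRecord:
    """Unified view of one student. Every field name here is hypothetical."""
    student_id: str
    assignments: list[dict[str, Any]] = field(default_factory=list)
    absences: Optional[int] = None  # None means "source did not report it"
    grades: dict[str, str] = field(default_factory=dict)
    proficiency: dict[str, str] = field(default_factory=dict)


def normalize(canvas: dict, genesis: dict, mastery: dict) -> StudentRecord:
    """Merge three differently shaped payloads into one record."""
    # In this made-up example each source names the student key differently.
    ids = {
        str(canvas["user_id"]),
        str(genesis["StudentID"]),
        str(mastery["student"]["id"]),
    }
    if len(ids) != 1:
        # Refuse to merge rather than guess which student is meant.
        raise ValueError(f"payloads refer to different students: {sorted(ids)}")
    return StudentRecord(
        student_id=ids.pop(),
        assignments=[
            {"name": a["title"], "submitted": a["submitted"]}
            for a in canvas.get("assignments", [])
        ],
        absences=genesis.get("Absences"),
        grades=dict(genesis.get("Grades", {})),
        proficiency={s["standard"]: s["level"] for s in mastery.get("scores", [])},
    )


# Illustrative payloads only; real responses will look different.
canvas = {"user_id": 42, "assignments": [{"title": "Essay 1", "submitted": True}]}
genesis = {"StudentID": "42", "Absences": 3, "Grades": {"ELA": "B+"}}
mastery = {"student": {"id": "42"}, "scores": [{"standard": "RL.9-10.1", "level": "Mastery"}]}

record = normalize(canvas, genesis, mastery)
```

The design choice worth noting is that a missing value stays `None` instead of being defaulted, so a later step can tell the model the data is absent rather than letting it assume.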

## MISO: Keeping Outputs Honest

Data quality going in is only half the problem. To protect quality coming out, the pipeline uses a layer called MISO, short for Model Independent System Output improvement tools. Rather than modifying the model itself, MISO adjusts how data flows through the pipeline before the model ever sees it. The goal is simple: every response should be faithful to the source data, not a confident-sounding extrapolation.

The "model independent" part matters. MISO works at the data layer, which means it's not tied to any particular model. Swap the LLM and the output quality controls stay in place.
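The post does not describe how MISO works internally, so the following is only a guess at what a data-layer control of this kind could look like. The function names (`build_context`, `answer`), the `NOT AVAILABLE` marker, and the two stand-in models are all assumptions made for illustration. The point of the sketch is the shape: the data preparation runs before the model and takes the model as a plain parameter, so swapping the model does not touch it.

```python
from typing import Callable

# Any LLM wrapper fits this type: prompt in, text out. Hypothetical interface.
Model = Callable[[str], str]


def build_context(record: dict) -> str:
    """Render source data as explicit facts, marking gaps instead of hiding them."""
    lines = []
    for key in sorted(record):
        value = record[key]
        lines.append(f"{key}: {'NOT AVAILABLE' if value is None else value}")
    return "\n".join(lines)


def answer(question: str, record: dict, model: Model) -> str:
    """Run the data-layer step first; the model is just a parameter."""
    prompt = (
        "Answer using only the facts below. If a fact is NOT AVAILABLE, say so.\n"
        f"{build_context(record)}\n\nQuestion: {question}"
    )
    return model(prompt)


# Two stand-in "models" for demonstration; neither is a real LLM.
def echo_model(prompt: str) -> str:
    return prompt


def shouting_model(prompt: str) -> str:
    return prompt.upper()
```

Calling `answer(question, record, echo_model)` and `answer(question, record, shouting_model)` sends the same prepared context to both stand-ins, which is the property the "model independent" label describes.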

## What It Adds Up To

The result is a system that is robust and unified, two things that are hard to achieve simultaneously when your inputs come from different vendors, different schemas, and different update cycles. More practically, it means a teacher can ask a question and trust the answer, because the pipeline has already done the work of reconciling data from three different systems into one coherent picture.

That's the job: not to build the model, but to build the infrastructure that makes the model worth using.