Ivan França – AI Engineering Notes
  • Home
  • Contact
  • About this blog
  • About me
Github Linkedin kaggle
  • Conceptual illustration of a Retrieval-Augmented Generation system analyzing movie plot data, showing text segmentation, vector database storage, and narrative structure analysis.
    AI Engineering | Data & Text Processing | Retrieval-Augmented Generation (RAG)

    RAG Movie Plots: Understanding Narrative Structure Before Building RAG Systems

    By Ivan França | 03/03/2026 | 03/03/2026

    Introduction: When building Retrieval-Augmented Generation (RAG) systems, it is tempting to focus immediately on embeddings, chunk sizes, vector databases and prompt design. However, segmentation and retrieval behavior are not independent engineering choices. They are constrained by the structure of the data itself. This article explores the structural characteristics of the Wikipedia Movie Plots dataset and…


  • Digital illustration of a modular Retrieval-Augmented Generation architecture with separated ingestion and generation layers.
    AI Engineering | Retrieval-Augmented Generation (RAG)

    RAG Movie Plots: Designing a Modular RAG System

    By Ivan França | 14/02/2026 | 03/03/2026

    Introduction: Retrieval-Augmented Generation (RAG) systems are often described as linear pipelines: load documents, split text, generate embeddings, store vectors, retrieve context and invoke a language model. This description is conceptually accurate, but architecturally incomplete. When presented as a sequence of steps, the structural decisions that shape system behavior remain implicit. Choices about segmentation, persistence, filtering…


  • Abstract illustration of a broken data pipeline representing a silent failure in a RAG chunking process
    AI Engineering | Retrieval-Augmented Generation (RAG)

    When separator="\n" Silently Breaks Chunk Overlap in RAG Pipelines

    By Ivan França | 03/02/2026 | 03/03/2026

    Introduction: Chunk overlap is widely treated as a reliable mechanism for preserving semantic continuity between adjacent chunks in Retrieval-Augmented Generation (RAG) pipelines. The intuition is straightforward: reuse part of the end of one chunk at the start of the next so the context isn’t artificially broken. In practice, this mechanism is rarely examined in detail….
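
The overlap intuition described in this excerpt can be sketched with a naive fixed-size character window. This is only an illustration under stated assumptions: the function name `chunk_with_overlap` and its parameters are hypothetical, and it is not the separator-based splitter the article examines, whose behavior may differ.

```python
def chunk_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters.

    Hypothetical helper for illustration only; it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    pieces = chunk_with_overlap("abcdefghij", chunk_size=4, overlap=2)
    for previous, current in zip(pieces, pieces[1:]):
        # The tail of each chunk is reused as the head of the next one.
        print(repr(previous), "->", repr(current), "shared:", repr(previous[-2:]))
```

In this sketch every chunk begins with the last `overlap` characters of the previous one, which is the continuity the intuition assumes. Whether a separator-aware splitter preserves that property is exactly the kind of detail that needs checking against real output.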


© 2026 Ivan França - AI Engineering Notes
