IdeaboxAI Docs

Knowledge Bases

Centralize structured and unstructured data for Agentic BI, AI agents, and automation workflows

Introduction

Knowledge Bases (KBs) in IdeaboxAI let you centralize structured and unstructured data for Agentic BI, AI agents, and automation workflows. This guide walks you through the end-to-end lifecycle, from creating a knowledge base to querying it.

Getting started

Access the knowledge bases dashboard

Navigate to Knowledge Bases in the sidebar to view your workspace's knowledge base dashboard. Here you'll find:

  • A searchable table of all existing knowledge bases
  • KB metadata including name, status, type, creator, and last modified date
  • The + Create Knowledge Base button to start building a new KB
Knowledge Base dashboard showing list of KBs

Actions you can perform:

  • Navigate to the Knowledge Bases section
  • Search for an existing knowledge base
  • Initiate creation of a new knowledge base

💡 Tip: Use clear and consistent naming conventions for knowledge bases to help your team quickly identify the right data source for each task.

Best practices:

  • Periodically review the Last Modified column to track stale knowledge bases
  • Limit KB creation to meaningful datasets to avoid fragmentation
  • Document the purpose and scope of each KB clearly
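
As a rough illustration of the stale-KB review, the sketch below assumes you have copied the dashboard rows into a list of dictionaries. The field names (`name`, `last_modified`), the 90-day threshold, and the helper `find_stale_kbs` are assumptions made for this example; they are not part of IdeaboxAI.

```python
from datetime import datetime, timedelta


def find_stale_kbs(kbs, now, max_age_days=90):
    """Return the names of KBs last modified more than max_age_days before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for kb in kbs:
        # Dates are assumed to be ISO 8601 strings such as "2024-05-20".
        last_modified = datetime.fromisoformat(kb["last_modified"])
        if last_modified < cutoff:
            stale.append(kb["name"])
    return stale


# Hypothetical rows shaped like the dashboard columns.
SAMPLE_KBS = [
    {"name": "finance-invoices-2023", "last_modified": "2023-01-15"},
    {"name": "support-faq", "last_modified": "2024-05-20"},
]

if __name__ == "__main__":
    print(find_stale_kbs(SAMPLE_KBS, now=datetime(2024, 6, 1)))
```

Passing `now` explicitly keeps the check reproducible; in day-to-day use you would likely pass `datetime.now()`.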

Creating a knowledge base

Select the knowledge base type

When you click + Create Knowledge Base, you'll see three KB types to choose from:

  1. Unstructured Data - PDFs, CSVs, images, text documents
  2. Structured Data - Databases such as MySQL, PostgreSQL
  3. Financial Documents - Specialized PDF financial data
Knowledge base type selection modal

The system routes you to a configuration flow specific to the selected KB type.

⚠️ Warning: Choose Structured Data only when your data lives in a relational database with a defined schema. Use Unstructured Data for documents intended for semantic search or AI reasoning.

Best practices:

  • Avoid mixing heterogeneous data types in a single KB
  • Select the type that best matches your data format and use case
  • Consider how agents will query the data when making your selection

Configure knowledge base metadata

After selecting the type, you'll see a form requesting:

  • Knowledge Base Name (required) - A unique, descriptive identifier
  • Description - Explain the purpose and scope of the KB
  • Tags - Add keywords for domain, team, or project association
Metadata configuration for unstructured KB
Metadata configuration for structured KB

Click Create to initialize your knowledge base.

Best practices:

  • Write concise but descriptive KB descriptions
  • Use tags for easy discovery and organization
  • Avoid renaming KBs after downstream integrations are established

Working with unstructured data

Ingest data from multiple sources

After creating an unstructured knowledge base, you'll see an empty Data Source panel prompting you to add data. Click Add Data to access multiple ingestion options:

  • Upload from Device - Upload local files
  • Add Markdown/Text - Create or paste text content
  • Upload from Google Drive - Connect your Drive account
  • Upload from Confluence - Import Confluence pages
Unstructured data ingestion options

Actions you can perform:

  • Upload files such as PDFs, CSVs, and text documents
  • Connect external sources (Drive, Confluence)
  • Batch upload multiple files at once

💡 Tip: Ensure documents are machine-readable (avoid scanned PDFs when possible) for better processing results.

Best practices:

  • Upload logically related documents together
  • Re-index data after bulk uploads
  • Validate that files are accessible and not corrupted before uploading
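
One way to validate files before uploading is a small local pre-check. The sketch below is only an illustration: `check_file` is a hypothetical helper, and the rules it applies (non-empty file, `%PDF-` header for PDFs, UTF-8 decoding for text-like files) are assumptions about what "not corrupted" means, not checks that IdeaboxAI itself documents. It cannot tell whether a PDF is a scanned image.

```python
from pathlib import Path

TEXT_SUFFIXES = {".txt", ".md", ".csv"}


def check_file(path):
    """Return a list of problems found with a file; an empty list means it looks OK."""
    p = Path(path)
    if not p.is_file():
        return ["file not found"]
    data = p.read_bytes()
    if not data:
        return ["file is empty"]

    problems = []
    suffix = p.suffix.lower()
    # Well-formed PDF files begin with the "%PDF-" marker.
    if suffix == ".pdf" and not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header; file may be corrupted")
    # Text-like files should decode cleanly (UTF-8 is assumed here).
    if suffix in TEXT_SUFFIXES:
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            problems.append("not valid UTF-8 text")
    return problems
```

Running this over a folder before a batch upload gives a quick list of files worth opening by hand.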

Monitor data processing status

Once files are uploaded, they enter a processing state. You'll see a tabular list with columns:

  • File Name - The name of the uploaded file
  • Status - Processing / Processed
  • File Type - PDF, CSV, TXT, etc.
Data processing status view for unstructured KB

Actions you can perform:

  • Monitor ingestion and processing status
  • Trigger manual re-indexing if required
  • Remove files that failed processing

⚠️ Warning: Wait for all files to reach Processed status before querying. Investigate repeated processing failures promptly.

Once processed, data becomes searchable and usable by your agents.
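
If you automate around ingestion, the "wait until everything is Processed" rule can be sketched as a polling loop. This guide does not describe a status API, so the example takes a caller-supplied `fetch_statuses` function that returns a `{file_name: status}` mapping; that function, the helper name, and the timing defaults are all assumptions.

```python
import time


def wait_until_processed(fetch_statuses, poll_seconds=5.0, timeout_seconds=600.0,
                         sleep=time.sleep):
    """Poll until every file reports "Processed" or the timeout elapses.

    Returns the sorted names of files that are still not processed
    (an empty list means everything finished).
    """
    waited = 0.0
    while True:
        statuses = fetch_statuses()
        pending = sorted(name for name, status in statuses.items()
                         if status != "Processed")
        if not pending or waited >= timeout_seconds:
            return pending
        sleep(poll_seconds)
        waited += poll_seconds
```

Files still listed after the timeout are the ones to investigate or remove before querying.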

Working with structured data

Connect your database

For structured knowledge bases, you'll start by connecting to your external data source. Enter the following credentials:

  • Host - Your database server address
  • Port - Connection port (e.g., 5432 for PostgreSQL)
  • Database Name - The specific database to connect to
  • Username & Password - Authentication credentials
  • SSL Mode - Security settings for the connection
Connect external data source dialog
Enter database credentials for structured KB

After a successful connection, IdeaboxAI imports your database schema.
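
Before typing credentials into the form, it can help to confirm they work from your own machine. The sketch below assembles the same fields into a standard PostgreSQL connection URI that you could pass to a client such as `psql`. The helper name is hypothetical, and the list of SSL modes is the set PostgreSQL's libpq accepts; the options offered in the IdeaboxAI form may differ.

```python
from urllib.parse import quote, urlencode

# SSL modes accepted by PostgreSQL's libpq client library.
LIBPQ_SSL_MODES = {"disable", "allow", "prefer", "require", "verify-ca", "verify-full"}


def build_postgres_uri(host, port, database, username, password, ssl_mode="require"):
    """Build a postgresql:// URI from the connection form fields."""
    if ssl_mode not in LIBPQ_SSL_MODES:
        raise ValueError(f"unsupported sslmode: {ssl_mode}")
    port = int(port)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Percent-encode credentials so characters like "@" or "/" do not break the URI.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    db = quote(database, safe="")
    params = urlencode({"sslmode": ssl_mode})
    return f"postgresql://{user}:{secret}@{host}:{port}/{db}?{params}"


if __name__ == "__main__":
    # Placeholder values only; never print real passwords.
    print(build_postgres_uri("db.example.com", 5432, "sales", "report_user", "p@ss/word"))
```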

Explore the datasets view

Once connected, you'll see a schema visualization showing:

  • Database tables (Data Sets) listed in the sidebar
  • Table relationships and joins
  • Options to enhance with AI, add semantics, or define relationships
Datasets view showing database schema

Actions you can perform:

  • Browse database tables
  • Define joins and relationships
  • Enhance schema metadata using AI assistance
  • Document business logic for columns and tables

Best practices:

  • Validate relationships against source database constraints
  • Avoid circular or ambiguous joins
  • Document semantic definitions clearly

The result is a semantically enriched data model suitable for analytics and AI queries.
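
To make the "avoid circular or ambiguous joins" guideline concrete, here is a minimal sketch that treats tables as nodes and joins as undirected edges, then uses a union-find structure to flag any join that connects two tables which are already linked by another path. The table names and the `find_circular_joins` helper are illustrative assumptions; IdeaboxAI may perform its own validation.

```python
def find_circular_joins(joins):
    """Return the joins that close a loop between already-connected tables.

    `joins` is an iterable of (left_table, right_table) pairs.
    A self-join (same table on both sides) is also reported.
    """
    parent = {}

    def find(table):
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]  # path halving
            table = parent[table]
        return table

    closing = []
    for left, right in joins:
        root_left, root_right = find(left), find(right)
        if root_left == root_right:
            # Both tables are already reachable from each other: a second path.
            closing.append((left, right))
        else:
            parent[root_left] = root_right
    return closing


# Hypothetical schema: the last join creates a second path from vendors to customers.
SAMPLE_JOINS = [
    ("orders", "customers"),
    ("orders", "items"),
    ("items", "vendors"),
    ("vendors", "customers"),
]
```

A flagged join is not necessarily wrong, but it means a query between those tables has more than one possible path, which is worth documenting or removing.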

Creating and querying cubes

Understanding cubes

Cubes are analytical models built on top of your structured data. They define:

  • Measures - Metrics you want to calculate (e.g., total revenue, average price)
  • Dimensions - Attributes to group and filter by (e.g., date, category, region)
  • SQL Logic - The underlying query that generates the cube data
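
The relationship between these three parts can be sketched in a few lines of Python. This is a conceptual model only, not how IdeaboxAI stores or compiles cubes: the `Cube` class, the `items` table, and its columns are all hypothetical. It shows how a chosen measure and dimension wrap the base SQL in an aggregate query.

```python
from dataclasses import dataclass, field


@dataclass
class Cube:
    name: str
    base_sql: str
    measures: dict = field(default_factory=dict)    # measure name -> aggregate expression
    dimensions: dict = field(default_factory=dict)  # dimension name -> column expression

    def build_query(self, measures, dimensions):
        """Wrap the base SQL in a SELECT with the chosen measures and dimensions."""
        dim_cols = [f"{self.dimensions[d]} AS {d}" for d in dimensions]
        measure_cols = [f"{self.measures[m]} AS {m}" for m in measures]
        columns = ", ".join(dim_cols + measure_cols)
        sql = f"SELECT {columns} FROM ({self.base_sql}) AS base"
        if dimensions:
            group_by = ", ".join(self.dimensions[d] for d in dimensions)
            sql += f" GROUP BY {group_by}"
        return sql


# Hypothetical cube over an assumed `items` table.
item_cube = Cube(
    name="ITEM_VENDOR_SUMMARY",
    base_sql="SELECT vendor, category, price FROM items",
    measures={"total_revenue": "SUM(price)", "average_price": "AVG(price)"},
    dimensions={"vendor": "vendor", "category": "category"},
)

if __name__ == "__main__":
    print(item_cube.build_query(["total_revenue"], ["category"]))
```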

Access the cube configuration interface

In the structured knowledge base view, you'll see:

  • A list of existing cubes on the left panel
  • Measures and dimensions associated with the selected cube
  • Query execution controls and result preview on the main panel
Cube configuration interface

Click the Add Cube button to initiate cube creation.

💡 Tip: Ensure the underlying datasets are already validated and available before creating cubes. Follow consistent naming conventions for cubes to simplify discoverability.

Create a new cube

Add Cube

The Add New Cube modal provides two approaches: AI Generate and Manual Query.

AI Generate

Let AI generate the base SQL, measures, and dimensions from a natural language description. Provide the following:

  1. Cube Name - A unique, descriptive identifier (e.g., ITEM_VENDOR_SUMMARY)
  2. Cube Description - A concise explanation of the cube's analytical purpose
  3. Natural Language Query - Describe the desired analytics logic in plain language
  4. Business Context - Provide domain context to improve relevance and correctness
Data modeling with CubeJS using AI flow

The AI analyzes your datasets and business context to generate optimized SQL and relevant measures/dimensions.

Manual Query

Advanced users can author the SQL query directly. Provide the following:

  1. Cube Name - Unique identifier
  2. Base SQL Configuration - Write custom SQL for the cube
Data modeling with CubeJS using manual flow

⚠️ Warning: Prefer AI Generate for rapid prototyping and exploratory analytics. Use Manual Query for performance-critical or highly customized analytical logic.

Best practices:

  • Clearly document business intent in the description and context fields
  • Test generated SQL before deploying to production
  • Start with simple cubes and iterate based on user feedback
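
One lightweight way to test SQL before deploying it is to run it against a few hand-written rows where you already know the answer. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `items` table, its columns, and the query are hypothetical, and SQLite's dialect differs from MySQL and PostgreSQL, so treat this as a logic check rather than a substitute for running the query on the real source.

```python
import sqlite3

# Hypothetical base SQL for a cube; replace with the SQL you want to check.
BASE_SQL = """
SELECT vendor, category, SUM(price * quantity) AS total_revenue
FROM items
GROUP BY vendor, category
"""

SAMPLE_ROWS = [
    ("Acme", "tools", 10.0, 3),
    ("Acme", "tools", 5.0, 2),
    ("Globex", "paint", 8.0, 1),
]


def run_against_sample(sql, rows):
    """Load sample rows into an in-memory table and return the sorted query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE items (vendor TEXT, category TEXT, price REAL, quantity INTEGER)"
        )
        conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", rows)
        return sorted(conn.execute(sql).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_against_sample(BASE_SQL, SAMPLE_ROWS):
        print(row)
```

Keeping the sample small makes it easy to work out the expected totals by hand and compare.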

Execute queries and access results

Once your cube is created, query it from the cube selection panel. Select your analysis parameters:

  1. Choose Measures - The metrics you want to calculate
  2. Choose Dimensions - How to group or filter the data
  3. Apply Filters - Narrow down the results

Click Run Query to execute.

Cube selection panel with measures and dimensions

View results in multiple formats:

The interface provides several tabs for different use cases:

  • Results - Interactive table or chart view
  • Generated SQL - See the SQL that powers your query
  • SQL API - Endpoint for SQL-based integrations
  • REST API - RESTful endpoint for application integration
  • GraphQL API - GraphQL endpoint for flexible data fetching

Best practices:

  • Start with limited dimensions to optimize performance
  • Reuse generated APIs instead of re-running UI queries
  • Validate results against source systems for critical analytics
  • Export and share query results with your team
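
To reuse a query outside the UI, you would typically send the same measures, dimensions, and filters to the endpoint shown on the REST API tab. The sketch below only builds the request; it makes no network call. It assumes the endpoint accepts the open-source Cube REST query format (a JSON `query` parameter on a `/load` route), which this guide does not confirm. The base URL, cube name, and member names are placeholders; copy the real endpoint and credentials from the REST API tab.

```python
import json
from urllib.parse import urlencode


def build_cube_query(cube, measures, dimensions, filters=None, limit=None):
    """Build a Cube-style query dict. `filters` is a list of (member, operator, values)."""
    query = {
        "measures": [f"{cube}.{m}" for m in measures],
        "dimensions": [f"{cube}.{d}" for d in dimensions],
    }
    if filters:
        query["filters"] = [
            {
                "member": f"{cube}.{member}",
                "operator": operator,
                "values": [str(v) for v in values],
            }
            for member, operator, values in filters
        ]
    if limit is not None:
        query["limit"] = limit
    return query


def build_load_url(base_url, query):
    """Encode the query as the `query` parameter of a GET request to <base_url>/load."""
    params = urlencode({"query": json.dumps(query)})
    return f"{base_url.rstrip('/')}/load?{params}"


if __name__ == "__main__":
    q = build_cube_query(
        "ITEM_VENDOR_SUMMARY",
        measures=["total_revenue"],
        dimensions=["category"],
        filters=[("vendor", "equals", ["Acme"])],
        limit=100,
    )
    # Placeholder base URL; use the one from the REST API tab.
    print(build_load_url("https://example.invalid/cubejs-api/v1", q))
```

Starting with a `limit` and a single dimension mirrors the advice above about keeping early queries small.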

Integrating with agents

Your knowledge bases become even more powerful when connected to AI agents. Agents can:

  • Query unstructured documents to answer questions
  • Run analytical queries on structured data cubes
  • Combine multiple knowledge bases for comprehensive insights
  • Cite specific sources when providing answers

To connect a knowledge base to an agent, navigate to the agent's configuration and select your KB from the dropdown menu. For detailed instructions, see the Agents documentation.

Next steps

  • Agents — Learn how to create AI agents that leverage your knowledge bases
  • Agentic BI — Explore advanced business intelligence capabilities
