Data Source Configuration
This section discusses operations you can perform on data source objects or fields, such as anonymization and creating custom recognizers.
To ingest data with an Aisera application:
Select Settings > Data Sources to navigate to the Data Source Details page.
Choose an application that is integrated with the data source that you want to ingest, or click + New Data Source (for example, a source that includes documents with the newly supported file types) and associate it with an existing Aisera application.
Select the arrow/triangle in the upper-right section of the Data Source Details page to start the data ingestion job.
The Data Source Details window displays the metrics for the data as it is ingested. You will see the functions that you selected while creating the data source (User Learning in this example) and details of all the integration runs.
The Data Ingestion function supports .txt, .html, .md, .pdf, and .ppt file types.
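As an illustration, the following minimal sketch checks file names against the extensions listed above before you add them to a data source. It is not part of the Aisera product; the helper names `is_supported` and `split_by_support` are hypothetical, and the case-insensitive comparison is an assumption.

```python
from pathlib import Path

# Extensions listed in this section as supported by the Data Ingestion function.
SUPPORTED_EXTENSIONS = {".txt", ".html", ".md", ".pdf", ".ppt"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the supported list.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split file names into (supported, unsupported) lists, preserving order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```

For example, `split_by_support(["faq.md", "notes.docx"])` places `faq.md` in the first list and `notes.docx` in the second.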
Later, when you create your application/bot, select + Add Data Source on your application's Detail (summary) page and choose this Data Source from the list of available sources. For a detailed diagram, see Integrations and Data Sources.
Auto-Commit Setting Runs Index Jobs
You can schedule data source ingestion updates to your tenant. If the auto-commit flag is set to true in your Data Source Configuration, then the content is automatically approved and ingested.
In releases prior to 5/7/2025, the ingested data was not used in Knowledge Base Article responses until you ran the Neural search and RAG indexing jobs after the data ingestion.
After 5/7/2025, when the data source is updated, the Aisera Gen AI platform determines all applications/bots using this data source and automatically triggers search indexing jobs for those applications/bots. After the jobs are completed, the content is published and appears in live results.
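The behavior described above can be pictured with a small sketch. This is a conceptual model only, not the platform's implementation: the `Application` class, the function names, and the idea that the trigger is gated on the auto-commit flag are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """Hypothetical stand-in for an application/bot and the data sources it uses."""
    name: str
    data_source_ids: set = field(default_factory=set)


def apps_using(data_source_id, applications):
    """Return every application that references the given data source."""
    return [app for app in applications if data_source_id in app.data_source_ids]


def on_data_source_updated(data_source_id, auto_commit, applications):
    """Return the names of applications whose search indexing job would be triggered.

    Assumption: indexing jobs are triggered automatically only when the
    auto-commit flag is true; otherwise nothing is triggered here.
    """
    if not auto_commit:
        return []
    return [app.name for app in apps_using(data_source_id, applications)]
```

In this model, updating a data source that is shared by two bots returns both bot names, which mirrors the statement that indexing jobs are triggered for every application/bot that uses the data source.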
Set Recurring Knowledge Generation Jobs
You can now set a schedule for recurring Knowledge Generation jobs.
Previously, you could set the knowledge generation job to run when the number of tickets reaches a specified threshold (because more tickets create more accurate knowledge generation).
Now you can set a recurring schedule to ignore the threshold and run the knowledge generation job periodically (regardless of the number of tickets in your system). This allows you to review the results at specific times, instead of at random intervals.
To set the Recurring Schedule:
Navigate to Settings > Content Generation > Knowledge Generation.
Choose the Actions button in the upper-right corner of the Knowledge Generation window.
Select Set Recurring Schedule.

Pick Monthly, Bi-Monthly, or Quarterly as the recurring values.
Choose a Start Date from the Calendar option.
Set the Ticket Threshold option to Yes or No.
Set the Conditions, Field Assignments, and Pre-Generation Configuration.
Click the OK button.
Schedule Configuration Details
Frequency Selection: You can select a recurrence frequency (Monthly, Bi-Monthly, Quarterly).
Start Date: Specify the Start Date in UTC. The Start Date determines when the recurring KB generation job will begin. Past dates are disabled; choose a future date.
Ticket Threshold Setting: To ensure high-quality clustering, a minimum of ~40,000 tickets is recommended. Lower ticket volumes result in looser clusters with broader topics and less meaningful document generation. The Aisera Gen AI platform still allows you to generate documents with fewer tickets, because it is not realistic to expect every customer to have 40K tickets for every configuration.
If Enabled:
On the scheduled job run date, the system checks the total ticket count.
If the ticket count is below the threshold, the knowledge base will not be generated. You can still view the job details in the job filter dropdown. When you select the job, you will see a message such as: The KB generation did not run because the defined ticket threshold is 50K, while only 30K tickets were available at the time of execution.
If Disabled:
The system will ignore the ticket count threshold and process all available tickets on the job trigger date.
All other options — such as Ticket Conditions, Knowledge Field Mapping, and Pre-Generation Configuration — will remain the same as those available for a regular job run.
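The Enabled and Disabled cases above amount to a simple decision, sketched here for clarity. The function name `should_generate` and its parameters are hypothetical; the message text follows the example in this section, with the counts filled in from the arguments.

```python
def should_generate(ticket_count: int, threshold_enabled: bool, threshold: int):
    """Decide whether a scheduled knowledge generation job runs.

    Returns (run, message). When the threshold is enabled and the ticket
    count is below it, the job is skipped and a message explains why.
    When the threshold is disabled, the ticket count is ignored.
    """
    if threshold_enabled and ticket_count < threshold:
        message = (
            "The KB generation did not run because the defined ticket "
            f"threshold is {threshold}, while only {ticket_count} tickets "
            "were available at the time of execution."
        )
        return False, message
    return True, ""
```

With a threshold of 50,000 and 30,000 available tickets, the job is skipped when the threshold is enabled and runs when it is disabled.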
After you set the recurring option and return to the Set Recurring Schedule window, the Job will trigger on… field will be displayed below the Start Date (UTC). This value is dynamic. For example, if the schedule is monthly and the job ran yesterday, the Job will trigger on field displays the next run date.
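One way such a dynamic value could be computed is sketched below. This is an illustration under stated assumptions, not the product's logic: Bi-Monthly is assumed to mean every two months, a start day that does not exist in the target month is clamped to that month's last day, and the names `add_months` and `next_trigger` are hypothetical.

```python
import calendar
from datetime import date

# Assumption: Bi-Monthly means every two months (not twice a month).
INTERVAL_MONTHS = {"Monthly": 1, "Bi-Monthly": 2, "Quarterly": 3}


def add_months(d: date, months: int) -> date:
    """Add whole months to a date, clamping the day to the month's last day."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def next_trigger(start: date, frequency: str, today: date) -> date:
    """Return the first scheduled run date on or after `today` (dates in UTC)."""
    step = INTERVAL_MONTHS[frequency]
    n = 0
    run = start
    # Always step from the original start date so clamped days do not drift.
    while run < today:
        n += 1
        run = add_months(start, n * step)
    return run
```

For a monthly schedule that started on one day and is viewed on the following day, `next_trigger` returns the date one month after the start, which matches the example of the field showing the next run date.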
Wait After Setting Schedule or Job Configuration
Setting Knowledge Generation schedules at the Bot level, instead of at the tenant level, gives you the ability to create different generation schedules for different bots.

However, this change means that you need to wait 10 minutes after you set up the Job Configuration so that the Directed Acyclic Graph (DAG) generated from the configuration can be created and associated with your bot.

After the wait, once the DAG has been created, you can click the Generate Knowledge button to begin the Knowledge Generation Job.
Post-Ingestion Indexing Tasks
There are post-ingestion tasks that you need to complete before your ingested data is ready for use. Post-ingestion tasks may include: Neural Search RAG indexing, Knowledge Article indexing, running Access Attribute Extraction jobs for User data, or running Discovery Ontology Indexing for Ticket data.
After your data is ingested, you need to run an Indexer job before you can use the AI Learning or Content Generation features on your application or bot.
See Post-Ingestion Tasks for more details.