Version: 6.0.0

Files Connector

Overview

The Files Connector lets you upload documents in various formats and extract their content. Whether you need to add PDF manuals, Word documents, spreadsheets, or plain text files to your knowledge base, the connector handles both the upload and the extraction. Uploaded files become searchable and accessible through our products ACE Search and Chat.

Supported File Types

The Files Connector supports various document formats, with particular emphasis on:

File Type	Extension	Description
Excel Spreadsheets	.xlsx	Microsoft Excel workbooks
PDF Documents	.pdf	Adobe Portable Document Format files
Word Documents	.docx, .doc	Microsoft Word documents
PowerPoint Presentations	.pptx, .ppt	Microsoft PowerPoint slides
Text Files	.txt	Plain text documents
HTML Files	.html, .htm	Web page documents
JSON Files	.json	JavaScript Object Notation files
CSV Files	.csv	Comma-separated values files

Using the Files Connector

Basic Upload Process

Navigate to Knowledge Management > Add Knowledge
Find and click on the "Files" option in the organization connector section
Click the plus button to expand the options
Provide a descriptive knowledge name
Upload files using one of these methods:
- Drag and drop files onto the upload area
- Click the browse button to select files from your computer
Wait for the green progress bar to reach 100% for each file
Set group permissions by clicking "Group Permissions" and selecting the appropriate groups
Click "Save" to finalize and process the files

File Management Options

During the upload process, you have several options:

Click the delete button to remove a file before saving
Click "Attach More Documents" to add additional files
Monitor the upload progress via the green progress bar

File Size Limitations

Each file can be up to 100MB in size
A minimum of two files can be uploaded
The maximum number of files can be configured in application settings
- This limit can be adjusted by admins or super admins
- Found in Application Settings > File Upload Limits
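
The limits above can be checked locally before you start an upload. The following is a minimal sketch, not part of the product: the function names are hypothetical, the extension list is copied from the Supported File Types table (which may not be exhaustive), and it assumes "100MB" means 100 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Extensions copied from the Supported File Types table (may not be exhaustive).
SUPPORTED_EXTENSIONS = {
    ".xlsx", ".pdf", ".docx", ".doc", ".pptx", ".ppt",
    ".txt", ".html", ".htm", ".json", ".csv",
}
# Assumption: "100MB" is interpreted as 100 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def check_file(name: str, size_bytes: int) -> list:
    """Return a list of problems for one file; an empty list means it looks OK."""
    problems = []
    ext = Path(name).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is larger than 100MB ({size_bytes} bytes)")
    return problems


def check_folder(folder: str) -> dict:
    """Map each problematic file in a folder to its list of problems."""
    report = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            problems = check_file(path.name, path.stat().st_size)
            if problems:
                report[path.name] = problems
    return report
```

Calling `check_folder("./to_upload")` on a local folder returns only the files that would likely be rejected, so an empty dictionary suggests the batch is ready to upload.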

Working with Added Files

After saving your file knowledge:

You'll be redirected to the "Existing Files" section
Here you can view:
- Knowledge name
- Status (Enabled/Success)
- Action options (Delete)
- Review button

Review Process

When you click the "Review" button:

You'll be redirected to the Manage Knowledge section for that specific file knowledge
Here you can see:
- Knowledge name
- Last indexed date
- Configuration details (file paths)
- Creator information (who added the knowledge)
- Group permissions (which groups the knowledge is shared with)
- Indexing attempts with timestamps

Monitoring Processing Status

To check the processing status:

In the Indexing Attempts section, look for "Time Started" information
Click "View Logs" to see detailed backend processing information
A popup will appear showing all processing details
When status changes from "Enabled" to "Success," your knowledge is ready to use

Knowledge Objects

After successful processing, you can access:

All extracted file content in the Knowledge Objects section
Each file will have its own entry with extracted data
This content is now searchable through the search functionality

Reindexing Using Vision

The Files Connector offers an advanced feature called "Reindex using Vision" that leverages multimodal AI to extract more detailed information from uploaded documents by processing them as images.

Accessing the Reindex Using Vision Feature

Navigate to Knowledge Management > Manage Knowledge
Click on the knowledge name for your file knowledge
Go to the Knowledge Objects tab
In the Knowledge Objects table, locate the Actions column
Click the Preview button for the document you want to reindex
In the Preview Data popup, you'll find a button labeled Reindex using Vision
Click this button to initiate the vision-based reindexing process

How Reindex Using Vision Works

When you activate this feature, the system performs the following steps:

Initiates the vision loader and loads the embedding model
Converts your document (e.g., PDF) into a series of images
Processes these images through a multimodal AI model
Extracts more detailed information from visual elements in the document
Creates a new, enhanced index of the document content

Monitoring the Reindexing Process

After initiating vision-based reindexing:

A new job will appear in the Indexing Attempts section
You can monitor progress through the logs by clicking the View Logs button
The logs will show steps like:
- "Initiating vision loader"
- "Loading embedding model"
- "Starting convert PDF to images"
- "Complete conversion of PDF to images"
- "Extracting using vision"
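
If you copy the log text out of the View Logs popup, a small script can tell you how far a reindexing job has progressed. This is a rough sketch under assumptions: the stage messages are taken from the list above, the exact log line format (timestamps, prefixes) is unknown, so matching is done by case-insensitive substring, and the function names are hypothetical.

```python
from typing import List, Optional

# Stage messages as quoted in the list above, in the order they are listed.
VISION_STAGES = [
    "Initiating vision loader",
    "Loading embedding model",
    "Starting convert PDF to images",
    "Complete conversion of PDF to images",
    "Extracting using vision",
]


def stages_seen(log_text: str) -> List[str]:
    """Return the known stage messages found in the log text, in stage order."""
    lowered = log_text.lower()
    return [stage for stage in VISION_STAGES if stage.lower() in lowered]


def latest_stage(log_text: str) -> Optional[str]:
    """Return the furthest stage found in the log text, or None if none appear."""
    seen = stages_seen(log_text)
    return seen[-1] if seen else None
```

For example, passing log text that contains only the first two messages to `latest_stage` returns "Loading embedding model", which indicates the PDF-to-image conversion has not started yet.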

Benefits of Vision-Based Reindexing

This feature provides several advantages:

Enhanced Content Extraction: Captures information that text-only processing might miss
Improved Visual Element Processing: Better handles documents with charts, diagrams, and tables
More Comprehensive Indexing: Creates a more complete representation of document content
Better Search Results: Enables more accurate responses to queries about the document

When to Use Vision-Based Reindexing

Consider using this feature for:

Documents with complex layouts or formatting
Content with significant visual elements like charts and diagrams
Scanned documents where text extraction is suboptimal
Technical documentation where visual precision is important

Note that vision-based reindexing is more computationally intensive and may take longer than standard indexing, especially for large documents with many pages.

Permission Management

Control who can access your uploaded files:

During upload, select the user groups that should have access
Multiple groups can be selected for broader access
Only users in the selected groups will be able to search and view this content
Permissions can be edited later through the knowledge management interface
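
The access rule described above can be pictured as a set intersection between a user's groups and the groups selected for a knowledge source. The sketch below only illustrates that rule; it is not the product's implementation, and the function names and sample group names are hypothetical.

```python
from typing import Dict, List, Set


def can_access(user_groups: Set[str], knowledge_groups: Set[str]) -> bool:
    """A user can access a knowledge source if they belong to at least one selected group."""
    return bool(user_groups & knowledge_groups)


def visible_sources(user_groups: Set[str], sources: Dict[str, Set[str]]) -> List[str]:
    """Return the names of the knowledge sources the user can search and view, sorted."""
    return sorted(
        name for name, groups in sources.items() if can_access(user_groups, groups)
    )


# Hypothetical knowledge sources and the groups selected for each one.
sources = {
    "HR Policies": {"hr"},
    "Product Manuals": {"support", "engineering"},
}
print(visible_sources({"support"}, sources))
```

Selecting several groups for one knowledge source widens access, because membership in any one of them is enough in this model.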

Best Practices

Organize Related Files: Upload related documents together as a single knowledge source
Use Descriptive Names: Name your knowledge sources clearly for easy identification
Check File Quality: Ensure documents are properly formatted before uploading
Text Recognition: For scanned PDFs, use OCR (Optical Character Recognition) before uploading
Regular Updates: Replace outdated documents with new versions as needed

Troubleshooting

Issue	Solution
Upload fails	Check that the file size and format are within the supported limits
Content not extracted correctly	Verify the file isn't corrupted or password-protected
Text appears garbled	Ensure the document uses standard encoding
Tables not properly processed	Consider converting complex tables to simpler formats
Images missing	The connector primarily extracts text content; for documents with significant visual elements, consider Reindex using Vision
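
Two of the issues in this table can be screened for locally before uploading. The sketch below is a heuristic only and rests on assumptions: it treats "standard encoding" as UTF-8 for text files, and it flags a PDF as probably password-protected when the raw bytes contain an `/Encrypt` entry, which encrypted PDFs carry in their trailer. Both function names are hypothetical, and the PDF check can give false positives.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (assumed 'standard encoding')."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def pdf_looks_encrypted(data: bytes) -> bool:
    """Heuristic: encrypted PDFs reference an /Encrypt dictionary in their trailer."""
    return data.startswith(b"%PDF") and b"/Encrypt" in data


def read_bytes(path: str) -> bytes:
    """Read a whole file as bytes so it can be passed to the checks above."""
    with open(path, "rb") as handle:
        return handle.read()
```

A typical use is `pdf_looks_encrypted(read_bytes("manual.pdf"))` for PDFs and `is_valid_utf8(read_bytes("notes.txt"))` for text, CSV, or JSON files.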

Content Processing Details

Understanding how different file types are processed can help you prepare optimal documents:

PDF Documents

Text content is extracted page by page
Basic formatting information is preserved when possible
Scanned PDFs without OCR may not yield searchable text
Complex layouts may be simplified during extraction

Word Documents

Text, tables, lists, and basic formatting are preserved
Embedded images are noted but not fully processed
Comments and tracked changes can be included or excluded
Complex features like macros are not processed

Excel Spreadsheets

Cell values and formulas are extracted
Sheet names and structure are preserved
Charts are noted but not fully rendered
Cell formatting may be simplified

PowerPoint Presentations

Slide text and basic structure are extracted
Speaker notes can be included
Slide titles are used as section headings
Animations and transitions are not processed

FAQ

Q: Can I upload password-protected documents?
A: No, files with password protection cannot be processed. Remove protection before uploading.

Q: How are file updates handled?
A: You'll need to create a new knowledge source with updated files. The system doesn't automatically track versions.

Q: Can I delete files after they've been processed?
A: Yes, but you should delete the entire knowledge source through the management interface.

Q: How are very large documents handled?
A: Large documents are broken into smaller chunks for processing but will appear as a single document in search results.

Q: Can I upload executable files or scripts?
A: No, for security reasons, executable files (.exe, .bat, .sh, etc.) are not supported.
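
The answer about large documents can be illustrated with a simplified model: the text is split into chunks that each remember their source document, and chunk-level search hits are collapsed back to one entry per document. This is a sketch of the general idea only. The actual chunk size and splitting strategy used by the connector are not documented here, so the `chunk_size` default, the dictionary fields, and the function names are all assumptions.

```python
from typing import Dict, List


def chunk_text(doc_id: str, text: str, chunk_size: int = 1000) -> List[Dict]:
    """Split text into fixed-size chunks, each tagged with its source document."""
    return [
        {
            "doc_id": doc_id,
            "chunk_index": start // chunk_size,
            "text": text[start:start + chunk_size],
        }
        for start in range(0, len(text), chunk_size)
    ]


def documents_for_hits(hits: List[Dict]) -> List[str]:
    """Collapse chunk-level hits to unique document ids, preserving hit order."""
    return list(dict.fromkeys(hit["doc_id"] for hit in hits))
```

Under this model, a 2,500-character document with `chunk_size=1000` yields three chunks, yet `documents_for_hits` reports it once even if all three chunks match a query.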

Related Features

To get the most out of your uploaded files, consider using these related features:

Knowledge Sets: Group related file uploads into unified collections
FAQ Management: Create frequently asked questions based on document content
Result Ranking: Adjust search relevance for important documents
Knowledge Objects: View and manage individual content pieces extracted from documents

By following this guide, you can effectively use the Files Connector to incorporate your important documents into the knowledge management system, making their content searchable and accessible.
