Version: 6.0.0

Files Connector

Overview

The Files Connector lets you upload documents in a range of formats and extract their content. Whether you need to add PDF manuals, Word documents, spreadsheets, or plain text files to your knowledge base, the connector handles the upload and extraction for you. Uploaded files become searchable and accessible through our products ACE Search and Chat.

Supported File Types

The Files Connector supports a range of document formats, including:

| File Type | Extension | Description |
| --- | --- | --- |
| Excel Spreadsheets | .xlsx | Microsoft Excel workbooks |
| PDF Documents | .pdf | Adobe Portable Document Format files |
| Word Documents | .docx, .doc | Microsoft Word documents |
| PowerPoint Presentations | .pptx, .ppt | Microsoft PowerPoint slides |
| Text Files | .txt | Plain text documents |
| HTML Files | .html, .htm | Web page documents |
| JSON Files | .json | JavaScript Object Notation files |
| CSV Files | .csv | Comma-separated values files |
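
If you prepare uploads with a script, it can help to filter out unsupported files before opening the upload screen. The following is a minimal sketch and not part of the product: the extension set is copied from the table above, and the helper names `is_supported` and `split_by_support` are hypothetical. The connector's actual allow-list may differ.

```python
from pathlib import Path

# Extensions copied from the table above; the connector's real allow-list may differ.
SUPPORTED_EXTENSIONS = {
    ".xlsx", ".pdf", ".docx", ".doc", ".pptx", ".ppt",
    ".txt", ".html", ".htm", ".json", ".csv",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension (case-insensitive) is in the set above."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition filenames into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for name in filenames:
        (supported if is_supported(name) else unsupported).append(name)
    return supported, unsupported
```

For example, `split_by_support(["guide.pdf", "install.sh"])` places `guide.pdf` in the first list and `install.sh` in the second.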

Using the Files Connector

Basic Upload Process

  1. Navigate to Knowledge Management > Add Knowledge
  2. Find and click on the "Files" option in the organization connector section
  3. Click the plus button to expand the options
  4. Provide a descriptive knowledge name
  5. Upload files using one of these methods:
    • Drag and drop files onto the upload area
    • Click the browse button to select files from your computer
  6. Wait for the green progress bar to reach 100% for each file
  7. Set group permissions by clicking "Group Permissions" and selecting the appropriate groups
  8. Click "Save" to finalize and process the files

File Management Options

During the upload process, you have several options:

  • Click the delete button to remove a file before saving
  • Click "Attach More Documents" to add additional files
  • Monitor the upload progress via the green progress bar

File Size Limitations

  • Each file can be up to 100 MB in size
  • A minimum of two files can be uploaded
  • The maximum number of files is configurable in the application settings
    • Admins and super admins can adjust this limit
    • The setting is found under Application Settings > File Upload Limits
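
A local pre-check can catch oversized files before you start an upload. This is a sketch under stated assumptions, not the connector's own validation: `check_upload` is a hypothetical helper, 100 MB is interpreted as 100 × 1024 × 1024 bytes, and `max_files` must be supplied by you from the value configured under Application Settings > File Upload Limits.

```python
import os

# 100 MB per file, per the limit above. Treating "MB" as 1024 * 1024 bytes is an assumption.
MAX_FILE_BYTES = 100 * 1024 * 1024


def check_upload(paths, max_files, max_file_bytes=MAX_FILE_BYTES):
    """Return a list of problems found; an empty list means the batch looks fine.

    max_files is not known to this script: pass the limit configured
    in your own application settings.
    """
    problems = []
    if len(paths) > max_files:
        problems.append(f"{len(paths)} files selected, limit is {max_files}")
    for path in paths:
        size = os.path.getsize(path)  # size on disk in bytes
        if size > max_file_bytes:
            problems.append(f"{path}: {size} bytes exceeds {max_file_bytes}")
    return problems
```

An empty result only means the files pass these two local checks; the server may still reject a file for other reasons.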

Working with Added Files

After saving your file knowledge:

  • You'll be redirected to the "Existing Files" section
  • Here you can view:
    • Knowledge name
    • Status (Enabled/Success)
    • Action options (Delete)
    • Review button

Review Process

When you click the "Review" button:

  • You'll be redirected to the Manage Knowledge section for that specific file knowledge
  • Here you can see:
    • Knowledge name
    • Last indexed date
    • Configuration details (file paths)
    • Creator information (who added the knowledge)
    • Group permissions (which groups the knowledge is shared with)
    • Indexing attempts with timestamps

Monitoring Processing Status

To check the processing status:

  1. In the Indexing Attempts section, look for "Time Started" information
  2. Click "View Logs" to see detailed backend processing information
  3. A popup will appear showing all processing details
  4. When status changes from "Enabled" to "Success," your knowledge is ready to use

Knowledge Objects

After successful processing:

  • All extracted file content appears in the Knowledge Objects section
  • Each file has its own entry with its extracted data
  • The content is searchable through the search functionality

Reindexing Using Vision

The Files Connector offers an advanced feature called "Reindex using Vision" that leverages multimodal AI to extract more detailed information from uploaded documents by processing them as images.

Accessing the Reindex Using Vision Feature

  1. Navigate to Knowledge Management > Manage Knowledge
  2. Click on the knowledge name for your file knowledge
  3. Go to the Knowledge Objects tab
  4. In the Knowledge Objects table, locate the Actions column
  5. Click the Preview button for the document you want to reindex
  6. In the Preview Data popup, you'll find a button labeled Reindex using Vision
  7. Click this button to initiate the vision-based reindexing process

How Reindex Using Vision Works

When you activate this feature, the system performs the following steps:

  1. Initiates the vision loader and loads the embedding model
  2. Converts your document (e.g., PDF) into a series of images
  3. Processes these images through a multimodal AI model
  4. Extracts more detailed information from visual elements in the document
  5. Creates a new, enhanced index of the document content
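
The flow above can be pictured with a small conceptual sketch. This is not the product's implementation: `reindex_with_vision` and its three callables (`to_images`, `describe_image`, `embed`) are hypothetical stand-ins for the PDF-to-image converter, the multimodal model, and the embedding model. The log strings are the ones this page reports for the real job.

```python
from typing import Callable, Dict, List, Sequence


def reindex_with_vision(
    document: bytes,
    to_images: Callable[[bytes], Sequence[bytes]],  # hypothetical: document -> page images
    describe_image: Callable[[bytes], str],         # hypothetical: multimodal model call
    embed: Callable[[str], List[float]],            # hypothetical: embedding model call
    log: Callable[[str], None] = print,
) -> List[Dict]:
    """Build a fresh per-page index by treating each page as an image."""
    log("Starting convert PDF to images")
    pages = to_images(document)
    log("Complete conversion of PDF to images")

    log("Extracting using vision")
    index = []
    for page_number, image in enumerate(pages, start=1):
        text = describe_image(image)  # text recovered from the page image
        index.append({"page": page_number, "text": text, "vector": embed(text)})
    return index
```

Passing the models in as callables keeps the sketch independent of any particular library; in practice each would wrap whichever converter and models your deployment uses.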

Monitoring the Reindexing Process

After initiating vision-based reindexing:

  • A new job will appear in the Indexing Attempts section
  • You can monitor progress through the logs by clicking the View Logs button
  • The logs will show steps like:
    • "Initiating vision loader"
    • "Loading embedding model"
    • "Starting convert PDF to images"
    • "Complete conversion of PDF to images"
    • "Extracting using vision"

Benefits of Vision-Based Reindexing

This feature provides several advantages:

  • Enhanced Content Extraction: Captures information that text-only processing might miss
  • Improved Visual Element Processing: Better handles documents with charts, diagrams, and tables
  • More Comprehensive Indexing: Creates a more complete representation of document content
  • Better Search Results: Enables more accurate responses to queries about the document

When to Use Vision-Based Reindexing

Consider using this feature for:

  • Documents with complex layouts or formatting
  • Content with significant visual elements like charts and diagrams
  • Scanned documents where text extraction is suboptimal
  • Technical documentation where visual precision is important

Note that vision-based reindexing is more computationally intensive and may take longer than standard indexing, especially for large documents with many pages.

Permission Management

Control who can access your uploaded files:

  • During upload, select the user groups that should have access
  • Multiple groups can be selected for broader access
  • Only users in the selected groups will be able to search and view this content
  • Permissions can be edited later through the knowledge management interface
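
The access rule described above amounts to a group-overlap check: a user can see a knowledge source if they belong to at least one of the groups selected for it. The sketch below illustrates that rule only. `can_access` and `visible_sources` are hypothetical names, and the product's actual permission model may include roles or rules not covered here.

```python
def can_access(user_groups, knowledge_groups) -> bool:
    """True if the user belongs to at least one group the knowledge is shared with."""
    return not set(user_groups).isdisjoint(knowledge_groups)


def visible_sources(user_groups, sources):
    """Return the names of knowledge sources the user may search and view.

    sources maps a knowledge name to the groups selected for it at upload time.
    """
    return [name for name, groups in sources.items() if can_access(user_groups, groups)]
```

For instance, with `sources = {"Manuals": {"support"}, "Payroll": {"hr"}}`, a user in the `support` group sees only `Manuals`.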

Best Practices

  • Organize Related Files: Upload related documents together as a single knowledge source
  • Use Descriptive Names: Name your knowledge sources clearly for easy identification
  • Check File Quality: Ensure documents are properly formatted before uploading
  • Text Recognition: For scanned PDFs, use OCR (Optical Character Recognition) before uploading
  • Regular Updates: Replace outdated documents with new versions as needed

Troubleshooting

| Issue | Solution |
| --- | --- |
| Upload fails | Check that the file's size and format are within the supported limits |
| Content not extracted correctly | Verify the file isn't corrupted or password-protected |
| Text appears garbled | Ensure the document uses standard encoding |
| Tables not properly processed | Consider converting complex tables to simpler formats |
| Images missing | Note that the connector primarily extracts text content |

Content Processing Details

Understanding how different file types are processed can help you prepare optimal documents:

PDF Documents

  • Text content is extracted page by page
  • Basic formatting information is preserved when possible
  • Scanned PDFs without OCR may not yield searchable text
  • Complex layouts may be simplified during extraction

Word Documents

  • Text, tables, lists, and basic formatting are preserved
  • Embedded images are noted but not fully processed
  • Comments and tracked changes can be included or excluded
  • Complex features like macros are not processed

Excel Spreadsheets

  • Cell values and formulas are extracted
  • Sheet names and structure are preserved
  • Charts are noted but not fully rendered
  • Cell formatting may be simplified

PowerPoint Presentations

  • Slide text and basic structure are extracted
  • Speaker notes can be included
  • Slide titles are used as section headings
  • Animations and transitions are not processed

FAQ

Q: Can I upload password-protected documents?
A: No, files with password protection cannot be processed. Remove protection before uploading.

Q: How are file updates handled?
A: You'll need to create a new knowledge source with updated files. The system doesn't automatically track versions.

Q: Can I delete files after they've been processed?
A: Yes, but you should delete the entire knowledge source through the management interface.

Q: How are very large documents handled?
A: Large documents are broken into smaller chunks for processing but will appear as a single document in search results.
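
As a rough illustration of chunking, the sketch below splits text into overlapping character windows and tags every chunk with its parent document, which is what allows results to be grouped back into a single document. The chunk size, the overlap, and the names `chunk_text` and `chunk_document` are assumptions for illustration; the connector's actual chunking strategy is not documented here.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into windows of chunk_size characters that overlap by overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def chunk_document(document_id, text, chunk_size=1000, overlap=100):
    """Tag each chunk with its parent document so hits can be grouped per document."""
    return [
        {"document_id": document_id, "chunk_index": i, "text": chunk}
        for i, chunk in enumerate(chunk_text(text, chunk_size, overlap))
    ]
```

The overlap keeps a sentence that straddles a boundary fully present in at least one chunk, at the cost of some duplicated text.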

Q: Can I upload executable files or scripts?
A: No, for security reasons, executable files (.exe, .bat, .sh, etc.) are not supported.

To get the most out of your uploaded files, consider using these related features:

  • Knowledge Sets: Group related file uploads into unified collections
  • FAQ Management: Create frequently asked questions based on document content
  • Result Ranking: Adjust search relevance for important documents
  • Knowledge Objects: View and manage individual content pieces extracted from documents

By following this guide, you can effectively use the Files Connector to incorporate your important documents into the knowledge management system, making their content searchable and accessible.