# Document Converter

### Document Converter

| **Type**    | Checkbox |
| ----------- | -------- |
| **Default** | Disabled |

Enables higher-quality PDF conversion through a more advanced conversion service. The service automatically splits large PDFs into smaller parts, supports caching, parses tables more accurately, and supports OCR (Optical Character Recognition).

See also: [Accurate Table Parse](#accurate-table-parse), [Force OCR](#force-ocr), [Bypass Cache](#bypass-cache), [Pdf Names](https://docs.aisera.com/aisera-platform/tenant-setup/aisera-platform-configuration/tenant-configuration-settings/parser/..#pdf-names-1)

### Accurate Table Parse

| **Type**     | Checkbox                                                  |
| ------------ | --------------------------------------------------------- |
| **Default**  | Disabled                                                  |
| **Requires** | [Document Converter](#document-converter) must be enabled |

Enables more accurate table parsing and cell content extraction. Useful for documents that contain complex tables requiring precise extraction.

When enabled and **Pdf Names** is empty, accurate table parsing applies to all PDFs. When **Pdf Names** is specified, only the listed PDFs are processed using accurate table parsing.
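The scoping rule above can be sketched as a small predicate. This is only an illustration of the described behavior, not the platform's implementation: the function name and parameter names are hypothetical, and treating the setting as inactive when Document Converter is disabled is an assumption based on the **Requires** row.

```python
from typing import Collection


def uses_accurate_table_parse(
    pdf_name: str,
    document_converter: bool,
    accurate_table_parse: bool,
    pdf_names: Collection[str] = (),
) -> bool:
    """Return True if this PDF would be parsed with accurate table parsing.

    All names here are hypothetical; they mirror the settings described
    in this section rather than any real API.
    """
    # Assumption: the setting has no effect unless Document Converter is on.
    if not (document_converter and accurate_table_parse):
        return False
    # An empty Pdf Names list means the setting applies to every PDF.
    if not pdf_names:
        return True
    # Otherwise only the listed PDFs are affected.
    return pdf_name in pdf_names
```

For example, with both checkboxes enabled and **Pdf Names** set to `["invoices.pdf"]`, the predicate is true for `invoices.pdf` and false for any other file.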

### Bypass Cache

| **Type**     | Checkbox                                                  |
| ------------ | --------------------------------------------------------- |
| **Default**  | Disabled                                                  |
| **Requires** | [Document Converter](#document-converter) must be enabled |

Forces a fresh conversion of each PDF on every request instead of serving a cached version. This helps with frequently updated documents, but it can slow access because the conversion runs on every retrieval.
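The trade-off can be illustrated with a minimal cache-aside sketch. Everything in it is hypothetical: `get_converted`, the `convert` callable, and the dictionary cache stand in for the real service, and storing the fresh result back into the cache after a bypass is an assumption, not documented behavior.

```python
from typing import Callable, Dict


def get_converted(
    pdf_id: str,
    convert: Callable[[str], str],
    cache: Dict[str, str],
    bypass_cache: bool,
) -> str:
    """Return the converted document, honoring a Bypass Cache style flag."""
    # With the flag off, a cached conversion is reused when one exists.
    if not bypass_cache and pdf_id in cache:
        return cache[pdf_id]
    # Otherwise the (potentially slow) conversion runs again.
    result = convert(pdf_id)
    # Assumption: the fresh result replaces any cached copy.
    cache[pdf_id] = result
    return result
```

With the flag disabled, repeated requests for the same PDF call `convert` once; with it enabled, every request pays the conversion cost.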

### Force OCR

| **Type**     | Checkbox                                                  |
| ------------ | --------------------------------------------------------- |
| **Default**  | Disabled                                                  |
| **Requires** | [Document Converter](#document-converter) must be enabled |

Applies Optical Character Recognition (OCR) to all documents during conversion. Useful for documents without embedded text, such as images or scanned files. Alternatively, to force OCR on specific files only, list them in the **Pdf Names** field.

When disabled, the system checks whether a PDF is image-only. If the PDF contains only images, OCR is applied; otherwise OCR is skipped.
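The enabled and disabled cases together amount to a simple decision, sketched below. The function and parameter names are hypothetical, and `is_image_only` stands in for whatever detection the conversion service performs. The sketch leaves out the **Pdf Names** list, since its exact interaction with this checkbox is not spelled out here.

```python
def should_apply_ocr(force_ocr: bool, is_image_only: bool) -> bool:
    """Decide whether OCR runs for a PDF (illustrative only)."""
    # Force OCR enabled: OCR is applied regardless of the PDF's content.
    if force_ocr:
        return True
    # Force OCR disabled: OCR runs only when the PDF contains only images.
    return is_image_only
```

A scanned, image-only PDF therefore gets OCR in either mode; the checkbox only changes the outcome for PDFs that already contain embedded text.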
