DocsCorp contentCrawler

contentCrawler for Bulk Image Conversion


contentCrawler is an integrated analysis, processing and reporting framework that intelligently assesses documents in a document management system for bulk processing.

Users can bulk process documents in the content repository using the OCR module, the Compression module, or both. For example, the OCR module converts all image-based documents in the DMS to text-searchable PDFs. The Compression module then applies compression and downsampling to all PDFs, reducing their file size.

The automated end-to-end process can run 24/7 without any staff intervention, emailing periodic notifications of processing statistics and error reporting to the IT Administrator. Staff no longer have to worry about OCR or Compression as a process or workflow.

contentCrawler is available as both an on-premises and a cloud solution.


Key Features

  • Assesses and analyzes documents in a content repository for OCR and/or compression processing
  • Processes image-based documents such as TIF, JPG, PNG and image PDFs
  • Converts image-based documents to text-searchable PDFs, adding a text layer for enhanced searching
  • Reduces image-based document file size using a variety of JPEG compression standards
  • Processes image-based attachments in emails
  • Applies configurable compression and text thresholds to optimize processing, skipping documents that do not meet the requirements
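
As a rough illustration of how threshold-based selection might work, the sketch below sorts documents into an OCR queue and a compression queue. The `Document` fields, the threshold names and their values are all hypothetical. They are not contentCrawler settings, and the product's actual criteria may differ.

```python
from dataclasses import dataclass


@dataclass
class Document:
    name: str
    text_chars_per_page: float  # average extractable characters per page
    size_kb: int                # file size in kilobytes


# Hypothetical thresholds, chosen only for illustration.
MAX_TEXT_CHARS_FOR_OCR = 50        # at or below this, treat the document as image-based
MIN_SIZE_KB_FOR_COMPRESSION = 500  # below this, skip compression


def needs_ocr(doc: Document) -> bool:
    """A document with little or no extractable text is a candidate for OCR."""
    return doc.text_chars_per_page <= MAX_TEXT_CHARS_FOR_OCR


def needs_compression(doc: Document) -> bool:
    """A document at or above the size threshold is a candidate for compression."""
    return doc.size_kb >= MIN_SIZE_KB_FOR_COMPRESSION


def plan(docs):
    """Return (ocr_queue, compression_queue) as lists of document names."""
    ocr_queue = [d.name for d in docs if needs_ocr(d)]
    compression_queue = [d.name for d in docs if needs_compression(d)]
    return ocr_queue, compression_queue
```

Documents that fall outside both thresholds appear in neither queue, which mirrors the idea of ignoring documents that do not meet the requirements.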



contentCrawler integrates with a number of leading document management systems as well as a Windows file system:

  • File System
  • HP TRIM/Records Manager
  • iManage Work
  • MS SharePoint
  • MS SharePoint Online (O365)
  • NetDocuments
  • OpenText Content Server
  • OpenText eDOCS DM
  • OpenText LiveLink
  • ProLaw
  • Worldox


System Requirements

Operating Systems

  • Microsoft® Windows Server® 2016, 2012 R2, 2012*, 2008 R2 SP1* or 2008 SP2*
  • MS .NET Framework 4.5/4.5.1

* Not supported on Server Core Role



Hardware

  • 8 GB RAM
  • 100 GB free disk space
  • 1-2 GB additional RAM per CPU core over 4*

* Recommended: 4 dedicated CPUs

contentCrawler supports multi-core CPUs with 4, 8, 16 and 32 cores.
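
The memory guidance above can be turned into a quick sizing estimate. This sketch assumes the per-core figure (1-2 GB for each CPU core beyond four) means additional RAM on top of the 8 GB baseline. That reading and the function name `estimate_ram_gb` are assumptions, not vendor guidance.

```python
BASE_RAM_GB = 8   # baseline RAM from the requirements list
BASE_CORES = 4    # cores covered by the baseline


def estimate_ram_gb(cores: int):
    """Return a (low, high) RAM estimate in GB for the given core count.

    Assumes 1 GB (low) to 2 GB (high) of extra RAM per core beyond four.
    """
    extra_cores = max(0, cores - BASE_CORES)
    return BASE_RAM_GB + extra_cores * 1, BASE_RAM_GB + extra_cores * 2


if __name__ == "__main__":
    for cores in (4, 8, 16, 32):
        low, high = estimate_ram_gb(cores)
        print(f"{cores} cores: {low}-{high} GB RAM")
```

Under these assumptions an 8-core server would need roughly 12 to 16 GB of RAM, and a 32-core server roughly 36 to 64 GB.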


contentCrawler Insights

Save up to 240 hours per person per year in productivity otherwise lost looking for missing or invisible documents

contentCrawler can run on 4, 8, 16 or 32 CPU cores for faster processing. It OCRs 2 pages per second on an 8-core CPU

contentCrawler finds 30% more documents than your document management search technology

Save up to 120 hours per person per year otherwise spent OCR’ing documents

Run fully-automated OCR processing 24/7, with no staff intervention needed

Can OCR up to 17,000 pages per day



© Copyright 2000-2020  COGITO SOFTWARE CO.,LTD. All rights reserved