
GroupDocs.Parser for .NET

 

GroupDocs.Parser for .NET is a document text extraction API. It extracts text and metadata from Microsoft Word, Excel and PowerPoint documents, email messages, container files that hold other files (such as ZIP archives), plain text files and HTML, without requiring any of the corresponding document readers to be installed. The API is designed for accurate, fast extraction. It also provides tools to detect text encodings such as UTF32 LE, UTF32 BE, UTF16 LE and UTF16 BE, among others.
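To make the idea of encoding detection concrete, here is a minimal sketch of one common technique: inspecting the byte order mark (BOM) at the start of a file. This is not the GroupDocs.Parser API, and it is not a claim about how the library detects encodings internally. The class name `BomSniffer` and the method `DetectEncodingName` are hypothetical names introduced purely for illustration.

```csharp
using System;

// Hypothetical helper for illustration only; not part of GroupDocs.Parser.
public static class BomSniffer
{
    // Returns the encoding name implied by a leading byte order mark,
    // or "unknown" when the data does not start with a recognized BOM.
    public static string DetectEncodingName(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        // UTF-32 must be tested before UTF-16: the UTF-32 LE mark (FF FE 00 00)
        // begins with the same two bytes as the UTF-16 LE mark (FF FE).
        if (StartsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32 LE";
        if (StartsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32 BE";
        if (StartsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (StartsWith(data, 0xFF, 0xFE)) return "UTF-16 LE";
        if (StartsWith(data, 0xFE, 0xFF)) return "UTF-16 BE";
        return "unknown";
    }

    // True when data begins with exactly the given prefix bytes.
    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A caller would pass the first few bytes of a file, for example `BomSniffer.DetectEncodingName(File.ReadAllBytes(path))`. Files without a BOM fall through to "unknown", and a UTF-16 LE file whose first character is NUL would be misread as UTF-32 LE, so a production detector typically combines a BOM check with content-based heuristics.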

 

An overview of the .NET text extraction API for retrieving raw and formatted text from documents.

 Features
  • Extract Raw Text

  • Extract Formatted Text

  • Extract Metadata

  • Encoding Detection

  • Media Type Detection

  • Extensible & Flexible

 The API
  • Gets Input File

  • Fetches Raw or Formatted Text

  • Fetches Metadata

 

Advanced Document Text Extraction API Features

Extracts raw and formatted text

Extracts metadata

Extracts structured text

Extracts highlighted text

Searches text in documents

Fetches text from containers that hold other files, such as ZIP archives

Gets formatted text from TXT, Markdown and HTML files

Support for encoding detection

Support for media type detectors

北京哲想软件有限公司