Why MapR Data Science Refinery?

Latest update: 2018/05/15

Enable More Accurate Insights with Access to All Data

The MapR Data Science Refinery is the only data science offering with secured access to all data. It connects out of the box with:

MapR-XD: for files and containers

Globally distributed data store

High-scale and reliable

 

MapR-DB: a highly scalable, multi-model, NoSQL database management system

Supports multiple data models, including wide-column, document, key-value, and time-series

MapR-ES: global publish-subscribe event streaming system

The first big data-scale streaming system built into a converged data platform

The only big data streaming system to support global event replication reliably at IoT scale

 

Create Real-Time Machine Learning Pipelines

A core component of the MapR Platform, MapR-ES is a global publish-subscribe event streaming system for big data. With native integration between MapR-ES and machine learning libraries, organizations can now create real-time machine learning pipelines, allowing them to apply ML models to real-time data.
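
As a rough illustration of applying a model to events as they arrive, the sketch below scores each event in a stream one at a time. It uses no MapR API: `event_stream` is a hypothetical in-memory stand-in for a MapR-ES topic consumer, and `anomaly_model` is a toy threshold rule standing in for a trained ML model.

```python
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, float]


def event_stream(readings: Iterable[float]) -> Iterator[Event]:
    """Hypothetical stand-in for a stream consumer: yields one event per reading."""
    for offset, value in enumerate(readings):
        yield {"offset": float(offset), "temperature": value}


def anomaly_model(event: Event, threshold: float = 80.0) -> bool:
    """Toy 'model' (assumption): flags readings above a fixed threshold."""
    return event["temperature"] > threshold


def score_stream(
    events: Iterable[Event], model: Callable[[Event], bool]
) -> Iterator[dict]:
    """Apply the model to each event as it arrives, without buffering the stream."""
    for event in events:
        yield {**event, "anomaly": model(event)}


# Score a small simulated stream of temperature readings.
scored = list(score_stream(event_stream([71.5, 79.9, 84.2, 90.0]), anomaly_model))
flags = [record["anomaly"] for record in scored]
```

Because `score_stream` is a generator, each event is scored as soon as it is read, which is the property a real-time pipeline depends on; in practice the in-memory source would be replaced by an actual stream consumer.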

 

Increase Data Science Productivity with Broad Language and Library Support

The MapR Data Science Refinery offers the Apache Zeppelin Data Science Notebook to provide the ability to work across many engines in one visual space:

Distributed Compute and ML programming with Apache Spark & Python

Batch and Interactive SQL with Apache Hive and Drill

Scripting support for Apache Pig

Shell access to MapR-FS

Programmatic access to MapR-DB and MapR-ES, using Spark

Easy Deployment with Persistent and Stateful Containers

 

Easy To Deploy

A Docker image is available on Docker Hub.

The image includes everything required, and nothing more, to use MapR as a persistent data store for your containerized applications.

 

Secure

Authentication occurs at a container level to ensure containerized applications only have access to data for which they are authorized.

Communications are encrypted to ensure privacy when accessing data in MapR.

 

Extensible

A Dockerfile will also be available on GitHub, allowing you to further customize the image as needed to support your specific application needs.

 

Persistent

Containers can easily use all the MapR Platform services (MapR-FS, MapR-DB, and MapR-ES, formerly MapR Streams) as a persistent data store.

 

Provide Robust Visualization Support to Data Scientists

The MapR Data Science Refinery comes with eight out-of-the-box visualization libraries, including Matplotlib and ggplot2. Apache Zeppelin provides a pluggable visualization framework to enable:

Common visualization libraries available in the NPM Registry

The ability to easily create and load custom visualizations

 

Enable Notebook/Model Collaboration, Sharing, and Mirroring

The MapR Converged Data Platform is ideal for storing model and notebook repositories. Organizations can leverage the MapR Platform’s global namespace and superior replication capability. The MapR Platform also offers immutable snapshots to persist and deploy various versions of the same model, making it possible for data scientists to compare the performance and accuracy of each version of the model.
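
To make the idea of comparing model versions concrete, here is a minimal sketch that evaluates two versions of a model against the same holdout data. Everything in it is illustrative: the `model_versions` dictionary is a hypothetical stand-in for versions restored from separate snapshots, and the threshold functions and holdout samples are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, int]  # (feature value, expected label)


def accuracy(predict: Callable[[float], int], samples: List[Sample]) -> float:
    """Fraction of samples for which the prediction matches the label."""
    correct = sum(1 for x, label in samples if predict(x) == label)
    return correct / len(samples)


# Invented holdout data shared by every version under comparison.
holdout: List[Sample] = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

# Hypothetical versions of the same model, e.g. one per snapshot.
model_versions: Dict[str, Callable[[float], int]] = {
    "v1": lambda x: int(x > 0.5),
    "v2": lambda x: int(x > 0.8),
}

# Evaluate every version on identical data and pick the most accurate one.
results = {name: accuracy(fn, holdout) for name, fn in model_versions.items()}
best = max(results, key=results.get)
```

Keeping each version immutable matters here: the comparison is only meaningful if every version is evaluated exactly as it was when it was saved.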
