Home

Welcome to the SCALPEL-Analysis wiki!

This wiki explains how to understand and make full use of the SCALPEL-Analysis library.

SCALPEL: A Scalable Pipeline

SCALPEL-Flattening and SCALPEL-Extraction perform batch operations: they read their input data from, and write their output data to, the file system (local or HDFS). They are implemented in Scala in order to access Spark's low-level API and to take advantage of functional programming and static typing, which supports rigorous automated testing (94% of the Scala code is covered by unit tests). Both can be configured through textual configuration files or used as libraries. SCALPEL-Analysis is a Python module built on PySpark and designed for interactive use, for instance in a Jupyter notebook. This workflow is illustrated in the following figure.

[Figure: SCALPEL3 workflow]
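
The file-based handoff described above, where a batch stage writes its output to the file system and an interactive stage reads it back, can be sketched as follows. This is only an illustration of the pattern and does not use the SCALPEL API: the function names, the `events.csv` file, and the column names are hypothetical, and plain CSV with the Python standard library stands in for Spark and for the actual storage format.

```python
import csv
import tempfile
from pathlib import Path


def batch_stage(output_dir: Path) -> Path:
    """Hypothetical batch step: write its results to the file system."""
    output_dir.mkdir(parents=True, exist_ok=True)
    output_file = output_dir / "events.csv"
    # Made-up rows, used only to have something to write and read back.
    rows = [
        {"patient_id": "p1", "event": "drug_a"},
        {"patient_id": "p2", "event": "drug_b"},
        {"patient_id": "p1", "event": "drug_b"},
    ]
    with output_file.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["patient_id", "event"])
        writer.writeheader()
        writer.writerows(rows)
    return output_file


def interactive_stage(input_file: Path) -> dict:
    """Hypothetical interactive step: load the batch output and count events per patient."""
    counts = {}
    with input_file.open(newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row["patient_id"]] = counts.get(row["patient_id"], 0) + 1
    return counts


# The two stages share nothing but a path on the file system.
with tempfile.TemporaryDirectory() as tmp:
    path = batch_stage(Path(tmp) / "extraction_output")
    events_per_patient = interactive_stage(path)
```

Because the stages only communicate through files, the batch part can run unattended while the analysis part is explored step by step, for example in a notebook.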
