Visual and Scientific Computing in the MapReduce Framework

Vispark: GPU-Accelerated Distributed Visual Computing Using Spark

With the growing need for big-data processing in diverse application domains, MapReduce (e.g., Hadoop) has become one of the standard computing paradigms for large-scale computing on a cluster system. Despite its popularity, the current MapReduce framework suffers from inflexibility and inefficiency inherent to its programming model and system architecture. To address these problems, we propose Vispark, a novel extension of Spark for GPU-accelerated MapReduce processing on array-based scientific computing and image processing tasks. Vispark provides an easy-to-use, Python-like high-level language syntax and a novel data abstraction for MapReduce programming on a GPU cluster system. Vispark introduces a programming abstraction for accessing neighbor data in the mapper function, which greatly simplifies many image processing tasks using MapReduce by reducing memory footprints and bypassing the reduce stage. Vispark provides socket-based halo communication that synchronizes data partitions transparently to the user, which is necessary for many scientific computing problems in distributed systems. Vispark also provides domain-specific functions and language support specifically designed for high-performance computing and image processing applications.

import numpy as np
from PIL import Image
from pyspark import SparkContext

def meanfilter(data, x, y):
  # Average the four 4-connected neighbors of pixel (x, y).
  # point_query_2d is a Vispark built-in for neighbor access.
  u = point_query_2d(data, x, y + 1)
  d = point_query_2d(data, x, y - 1)
  r = point_query_2d(data, x + 1, y)
  l = point_query_2d(data, x - 1, y)
  ret = (u + d + r + l) / 4.0
  return ((x, y), ret)

if __name__ == "__main__":
  sc = SparkContext(appName="meanfilter_vispark")
  img = np.fromstring(Image.open("lenna.png").tostring())
  # Tag="VISPARK" and vmap are Vispark extensions of the Spark API;
  # data, x, and y in the vmap call are Vispark language placeholders.
  imgRDD = sc.parallelize(img, Tag="VISPARK")
  imgRDD = imgRDD.vmap(meanfilter(data, x, y).range(512, 512))
  ret = np.array(sorted(imgRDD.collect()))[:, 1].astype(np.uint8)
  Image.fromstring("L", (512, 512), ret.tostring()).save("out.png")
A simple Vispark mean image filter example.
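The halo communication described in the abstract can be illustrated with a small, self-contained NumPy sketch (no sockets or Spark here; the strip partitioning and function names are hypothetical, and clamped boundary sampling is an assumption). Each "worker" holds a horizontal strip of the image and receives one ghost row from each neighboring strip before filtering, so the filter at strip boundaries sees the correct neighbor values:

```python
import numpy as np

def exchange_halos(strips):
    """Attach one ghost row from each neighboring strip (edge rows are
    replicated at the image border, mimicking clamped sampling)."""
    padded = []
    for i, s in enumerate(strips):
        top = strips[i - 1][-1:] if i > 0 else s[:1]
        bot = strips[i + 1][:1] if i < len(strips) - 1 else s[-1:]
        padded.append(np.vstack([top, s, bot]))
    return padded

def mean_filter(p):
    """4-neighbor mean on the interior rows of a halo-padded strip."""
    core = p[1:-1]
    up, down = p[:-2], p[2:]
    left = np.hstack([core[:, :1], core[:, :-1]])
    right = np.hstack([core[:, 1:], core[:, -1:]])
    return (up + down + left + right) / 4.0

img = np.arange(64, dtype=np.float64).reshape(8, 8)
strips = np.split(img, 4, axis=0)           # 4 "workers", 2 rows each
out = np.vstack([mean_filter(p) for p in exchange_halos(strips)])

# Reference: the same filter applied to the whole image at once.
whole = mean_filter(np.vstack([img[:1], img, img[-1:]]))
assert np.allclose(out, whole)
```

Because each strip only needs one row from each neighbor, the per-iteration communication volume stays proportional to the partition boundary, not the partition size.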
  • [PDF] [DOI] W. Choi, S. Hong, and W. Jeong, “Vispark: GPU-Accelerated Distributed Visual Computing Using Spark,” SIAM Journal on Scientific Computing (SISC), vol. 38, iss. 5, pp. S700–S719, 2016.
    [Bibtex]
    @article{woohyuk_2016_vispark,
    author={Woohyuk Choi and Sumin Hong and Won-Ki Jeong},
    title={{Vispark: {GPU}-Accelerated Distributed Visual Computing Using Spark}},
    journal={{SIAM Journal on Scientific Computing (SISC)}},
    publisher={Society for Industrial and Applied Mathematics},
    volume = {38},
    number = {5},
    pages = {S700-S719},
    year = {2016},
    doi = {10.1137/15M1026407},
    url = {http://dx.doi.org/10.1137/15M1026407},
    eprint = {http://dx.doi.org/10.1137/15M1026407}
    }

GPU in-memory processing using Spark for iterative computation

Due to its simplicity and scalability, MapReduce has become a de facto standard computing model for big data processing. Since the original MapReduce model was only appropriate for embarrassingly parallel batch processing, many follow-up studies have focused on improving the efficiency and performance of the model. Spark follows one of these recent trends by providing in-memory processing capability to reduce slow disk I/O for iterative computing tasks. However, the acceleration of Spark’s in-memory processing using graphics processing units (GPUs) is challenging due to its deep memory hierarchy and host-to-GPU communication overhead. In this paper, we introduce a novel GPU-accelerated MapReduce framework that extends Spark’s in-memory processing so that iterative computing is performed only in the GPU memory. Having discovered that the main bottleneck in the current Spark system for GPU computing is data communication on a Java virtual machine, we propose a modification of the current Spark implementation to bypass expensive data management for iterative task offloading to GPUs. We also propose a novel GPU in-memory processing and caching framework that minimizes host-to-GPU communication via lazy evaluation and reuses GPU memory over multiple mapper executions. The proposed system employs message-passing interface (MPI)-based data synchronization for inter-worker communication so that more complicated iterative computing tasks, such as iterative numerical solvers, can be efficiently handled.
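The lazy-evaluation idea above can be sketched in a few lines of plain Python (no real GPU or Spark here; the class and attribute names are hypothetical stand-ins). Map operations are only recorded, the data is copied to the "device" once, when a result is actually requested, and subsequent iterations reuse the device-resident buffer instead of copying on every step:

```python
class LazyGPUBuffer:
    """Toy model of lazy host-to-device transfer with device-side caching."""
    def __init__(self, host_data):
        self.host_data = host_data
        self.device_data = None       # filled on first use, then reused
        self.pending = []             # recorded (not yet executed) maps
        self.transfers = 0            # how many host->device copies happened

    def map(self, fn):
        self.pending.append(fn)       # lazy: just record the operation
        return self

    def collect(self):
        if self.device_data is None:  # first touch: one transfer only
            self.device_data = list(self.host_data)
            self.transfers += 1
        for fn in self.pending:       # run all fused maps on "device" data
            self.device_data = [fn(x) for x in self.device_data]
        self.pending = []
        return self.device_data

buf = LazyGPUBuffer([1, 2, 3, 4])
for _ in range(10):                   # ten "iterations" of a solver step
    buf.map(lambda x: x * 2)
result = buf.collect()                # all ten maps run after one transfer
```

Here ten mapper executions cost a single simulated host-to-device copy; in an eager design the same loop would pay the transfer on every iteration.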

  • [PDF] S. Hong, W. Choi, and W. Jeong, “GPU in-memory processing using Spark for iterative computation,” 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2017.
    [Bibtex]
    @inproceedings{hong_ccgrid_2017,
    title = "{GPU} in-memory processing using Spark for iterative computation",
    year = "2017",
    author = "Sumin Hong and Woohyuk Choi and Won-Ki Jeong",
    booktitle = {2017 17th {IEEE}/{ACM} International Symposium on Cluster, Cloud and Grid Computing ({CCG}rid)}
    }

Distributed Interactive Visualization using GPU-optimized Spark

With the advent of advances in imaging and computing technology, large-scale data acquisition and processing has become commonplace in many science and engineering disciplines. Conventional workflows for large-scale data processing usually rely on in-house or commercial software that is designed for domain-specific computing tasks. Recent advances in MapReduce, which was originally developed for batch processing textual data via a simplified programming model of the map and reduce functions, have expanded its applications to more general tasks in big-data processing, such as scientific computing and biomedical image processing. However, as shown in previous work, volume rendering and visualization using MapReduce is still considered challenging and impractical due to the disk-based, batch-processing nature of its computing model. In this paper, contrary to this common belief, we show that the MapReduce computing model can be effectively used for interactive visualization. Our proposed system is a novel extension of Spark, one of the most popular open-source MapReduce frameworks, that offers GPU-accelerated MapReduce computing. To minimize CPU-GPU communication and overcome slow, disk-based shuffle performance, the proposed system supports GPU in-memory caching and MPI-based direct communication between compute nodes. To allow for GPU-accelerated in-situ visualization using raster graphics in Spark, we leveraged the CUDA-OpenGL interoperability, resulting in faster processing speeds by several orders of magnitude compared to conventional MapReduce systems.
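As a rough illustration of why distributed rendering fits this model, the NumPy sketch below performs maximum-intensity-projection (MIP) volume rendering in a sort-last fashion: each "worker" projects its own sub-volume (the mapper), and the partial images are composited with an elementwise maximum (the reduction). This is only a toy stand-in for the paper's GPU pipeline, not its actual CUDA-OpenGL implementation:

```python
import numpy as np

def render_partial(subvolume):
    # Mapper: MIP of one worker's brick along the viewing (z) axis.
    return subvolume.max(axis=0)

def composite(partials):
    # Reduction: MIP compositing is an elementwise max, so it is
    # associative and order-independent, ideal for a reduce stage.
    return np.maximum.reduce(partials)

rng = np.random.default_rng(0)
volume = rng.random((16, 32, 32))             # axes: z, y, x
bricks = np.split(volume, 4, axis=0)          # 4 workers, 4 slabs each
image = composite([render_partial(b) for b in bricks])

# The composited result matches a single-node render of the full volume.
assert np.allclose(image, volume.max(axis=0))
```

Because the compositing operator is associative, partial images can be reduced in any order across workers, which is what makes keeping the bricks resident in GPU memory and exchanging only small 2D images between nodes so effective.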