Accepted Paper

Main Conference

Regular Paper

Chin-Chi Hsu, Perng-Hwa Kung, Mi-Yen Yeh, Shou-De Lin, and Phillip B. Gibbons, Bandwidth-Efficient Distributed k-Nearest-Neighbor Search with Dynamic Time Warping
Liang Zhao, Feng Chen, Chang-Tien Lu, and Naren Ramakrishnan, Dynamic theme tracking in Twitter
Sean Massung and Chengxiang Zhai, SyntacticDiff: Operator-Based Transformation for Comparative Text Mining
Suprio Ray, Angela Demke Brown, Nick Koudas, Rolando Blanco, and Anil Goel, Parallel In-Memory Trajectory-based Spatiotemporal Topological Join
Tri Kurniawan Wijaya, Matteo Vasirani, Samuel Humeau, and Karl Aberer, Cluster-based Aggregate Forecasting for Residential Electricity Demand using Smart Meter Data
Bin Dong, Suren Byna, and Kesheng Wu, Spatially Clustered Join on Heterogeneous Scientific Data Sets
Yixian Zheng, Wenchao Wu, Huamin Qu, Chunyan Ma, and Lionel M. Ni, Visual Analysis of Bi-directional Movement Behavior
Toyotaro Suzumura, ScaleGraph 2: A Library for Billion-Scale Graph Analytics
Yuncheng Li and Jiebo Luo, User-Curated Image Collections: Modeling and Recommendation
Maria Malik and Houman Homayoun, System and Architecture Level Characterization of Big Data Applications on Big and Little Core Server Architectures
Chung-Yi Li, Wei-Lun Su, Todd G. McKenzie, Fu-Chun Hsu, Shou-De Lin, Phillip B. Gibbons, and Jane Yung-jen Hsu, Recommending Missing Sensor Values
Cheng-Te Li, Yu-Jen Lin, and Mi-Yen Yeh, The Roles of Network Communities in Social Information Diffusion
Wang Ke, Guo Ping, and Luo A-Li, Angular Quantization Based Affinity Propagation Clustering and its Application to Astronomical Big Spectra Data
Ashwin Lall, Data Streaming Algorithms for the Kolmogorov-Smirnov Test
Masayo Ota, Huy Vo, Claudio T. Silva, and Juliana Freire, A Scalable Approach for Data-Driven Taxi Ride-Sharing Simulation
Yibo Yao and Lawrence Holder, Scalable Classification for Large Dynamic Networks
Jilong Kuang, Daniel Waddington, and Changhui Lin, A Fast and Scalable Time Series Traffic Generator
Ruslan Mavlyutov and Philippe Cudré-Mauroux, CINTIA: a Distributed, Low-Latency Index for Big Interval Data
Katayoun Neshatpour, Maria Malik, Mohammad Ali Ghodrat, and Houman Homayoun, Energy-Efficient Acceleration of Big Data Analytics Applications Using FPGAs
Lorenz Fischer and Abraham Bernstein, Workload Scheduling in Distributed Stream Processors using Graph Partitioning
Arghya Kusum Das, Seung-Jong Park, Jaeki Hong, and Wooseok Chang, Evaluating Different Distributed-Cyber-Infrastructure for Data and Compute Intensive Scientific Application
Vincenzo Gulisano, Yiannis Nikolakopoulos, Marina Papatriantafilou, and Philippas Tsigas, ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join
Yang Wang and Kwan-Liu Ma, Revealing the Fog-of-War: A Visualization-directed, Uncertainty-aware Approach for Exploring High-dimensional Data
Jilong Xue, Zhi Yang, Shian Hou, and Yafei Dai, When Computing Meets Heterogeneous Cluster: Workload Assignment in Graph Computation
Bokai Cao, Francine Chen, Dhiraj Joshi, and Philip S. Yu, Inferring Crowd-Sourced Venues for Tweets
Vasilis Efthymiou, Kostas Stefanidis, and Vassilis Christophides, Big Data Entity Resolution: From Highly to Somehow Similar Entity Descriptions in the Web
Huanhuan Wu, James Cheng, Yi Lu, Yiping Ke, Yuzhen Huang, Da Yan, and Hejun Wu, Core Decomposition in Large Temporal Graphs
Vasilis Efthymiou, George Papadakis, George Papastefanatos, Kostas Stefanidis, and Themis Palpanas, Parallel Meta-blocking: Realizing Scalable Entity Resolution over Large, Heterogeneous Data
Ioanna Filippidou and Yiannis Kotidis, Online and On-demand Partitioning of Streaming Graphs
Christos Anagnostopoulos and Peter Triantafillou, Learning to Accurately COUNT with Query-Driven Predictive Analytics
Jason H.D. Cho, Yanen Li, Roxana Girju, and Chengxiang Zhai, Recommending Forum Posts to Designated Experts
Mark Gates, Hartwig Anzt, Jakub Kurzak, and Jack Dongarra, Accelerating Collaborative Filtering Using Concepts from High Performance Computing
Eldon Carman, Vassilis Tsotras, Till Westmann, Vinayak Borkar, and Michael Carey, A Scalable Parallel XQuery Processor
Wei Xie, Feida Zhu, Siyuan Liu, and Ke Wang, Modelling Cascades Over Time in Microblogs
Guoxin Liu, Haiying Shen, and Haoyu Wang, Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems
Desheng Zhang, Ruobing Jiang, Shuai Wang, Yanmin Zhu, Bo Yang, Tian He, Jian Cao, and Fan Zhang, EveryoneCounts: Data-Driven Digital Advertising based on Uncertain Demand Models in Metro Networks
Eser Kandogan, Mary Roth, Peter Schwarz, Joshua Hui, Ignacio Terrizzano, Christina Christodoulakis, and Renee Miller, LabBook: Metadata-driven Social Collaborative Data Analysis
Liang Zhao, WenZhan Song, and Xiaojing Ye, Fast Decentralized Gradient Descent Method and Applications to In-situ Seismic Tomography
Yasser Salem, Jun Hong, and Weiru Liu, CSFinder: A Cold-Start Friend Finder in Large-Scale Social Networks
Nam-Luc Tran, Thomas Peel, and Sabri Skhiri, Distributed Frank-Wolfe under Pipelined Stale Synchronous Parallelism
Michele Bertoni, Stefano Ceri, Abdulrahman Kaitoua, and Pietro Pinoli, Evaluating Cloud Frameworks on Genomic Applications
Chenxi Qiu, Haiying Shen, and Liuhua Chen, Towards Green Cloud Computing: Demand Allocation and Pricing Policies for Cloud Service Brokerage
Hien To, Seon Ho Kim, and Cyrus Shahabi, Effectively Crowdsourcing the Acquisition and Analysis of Visual Data for Disaster Response
Zhen Chen, Hanghang Tong, and Lei Ying, Full Diffusion History Reconstruction in Networks
Demetris Trihinas, George Pallis, and Marios Dikaiakos, AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices
Zhao Zhang, Kyle Barbary, Frank Austin Nothaft, Evan Sparks, Oliver Zahn, Michael J. Franklin, David A. Patterson, and Saul Perlmutter, Scientific Computing Meets Big Data Technology: An Astronomy Use Case
Nikos Zacheilas, Vana Kalogeraki, Nikolas Zygouras, Nikolaos Panagiotou, and Dimitrios Gunopulos, Elastic Complex Event Processing exploiting Prediction
Huseyin Ulusoy, Murat Kantarcioglu, and Erman Pattuk, TrustMR: Computation Integrity Assurance system for MapReduce
Suchismit Mahapatra and Varun Chandola, Modeling Graphs Using a Mixture of Kronecker Models
Huseyin Ulusoy, Murat Kantarcioglu, Erman Pattuk, and Lalana Kagal, AccountableMR: Toward Accountable MapReduce systems
Stephen Bonner, A. Stephen McGough, Ibad Kureshi, John Brennan, Georgios Theodoropoulos, Laura Moss, David Corsar, and Grigoris Antoniou, Data Quality Assessment and Anomaly Detection Via Map / Reduce and Linked Data: A Case Study in the Medical Domain
Tian Guo, Jean-Paul Calbimonte, Hao Zhuang, and Karl Aberer, SigCO: Mining Significant Correlations via a Distributed Real-time Computation Engine
Eleazar Leal, Le Gruenwald, Jianting Zhang, and Simin You, TKSimGPU: A Parallel Top-K Trajectory Similarity Query Processing Algorithm for GPGPUs and Multicore CPUs
Michael Nalisnik, David Gutman, Jun Kong, and Lee Cooper, An Interactive Learning Framework for Scalable Classification of Pathology Images
Yen-Kai Wang, Wei-Ming Chen, Cheng-Te Li, and Shou-De Lin, Identifying Smallest Unique Subgraphs in a Heterogeneous Social Network
Jiejun Xu and Tsai-Ching Lu, Toward Precise User-Topic Alignment in Online Social Media
Xi Yang, Ning Liu, Bo Feng, Xian-He Sun, and Shujia Zhou, PortHadoop: Support Direct HPC Data Processing in Hadoop
John Canny, Huasha Zhao, Ye Chen, Jiangchang Mao, and Bobby Jaros, Machine Learning at the Limit
Masahiko Itoh, Daisaku Yokoyama, Masashi Toyoda, and Masaru Kitsuregawa, Visual Interface for Exploring Caution Spots from Vehicle Recorder Big Data
Nusrat Islam, Md. Wasi-ur-Rahman, Xiaoyi Lu, Dipti Shankar, and Dhabaleswar K. Panda, Performance Characterization and Acceleration of In-Memory File Systems for Hadoop and Spark Applications on HPC Clusters
Serafettin Tasci and Murat Demirbas, PANOPTICON: A lock broker architecture for scalable transactions in the datacenter
Bogdan Simion, Daniel Ilha, Suprio Ray, Leslie Barron, Angela Demke Brown, and Ryan Johnson, Slingshot: A Modular Framework for Designing Data Processing Systems

Short Paper

Amir Bahmani and Frank Mueller, ACURDION: An Adaptive Clustering-based Algorithm for Tracing Large-scale MPI Applications
Max Watson, Time Maps: A Tool for Visualizing Many Discrete Events Across Multiple Timescales
Dongfang Zhao, Nagapramod Mandagere, Gabriel Alatorre, Mohamed Mohamed, and Heiko Ludwig, Diego/P: Toward Locality-aware Scheduling for Containerized Cloud Services
Min Du and Feifei Li, ATOM: Automated Tracking, Orchestration, and Monitoring of Resource Usage in Infrastructure as a Service Systems
Xugang Ye, Zijie Qi, and Jingjing Li, Learning Relevance from Click Data via Neural Network based Similarity Models
I. Stephen Choi, Weiqing Yang, and Yang-Suk Kee, Early Experience with Optimizing I/O Performance Using High-Performance SSDs for In-Memory Cluster Computing
Inho Cho, Soya Park, Sejun Park, Dongsu Han, and Jinwoo Shin, Large-scale Parallel Combinatorial Optimization through Belief Propagation
Anand Tripathi and BhagavathiDhass Thirunavukarasu, A Transaction Model for Management of Replicated Data with Multiple Consistency Levels
Yu Wang and Jiebo Luo, America Tweets China: A Fine-Grained Analysis of the State and Individual Characteristics Regarding Attitudes towards China
Yu Jin, Joseph JaJa, Rong Chen, and Edward Herskovits, A Data-Driven Approach to Extract Connectivity Structures from Diffusion Tensor Imaging Data
Georgios Chatzigeorgakidis, Sophia Karagiorgou, Spiros Athanasiou, and Spiros Skiadopoulos, A MapReduce Based k-NN Joins Probabilistic Classifier
Alessandro Lulli, Thibault Debatty, Laura Ricci, Matteo Dell'Amico, and Pietro Michiardi, Scalable k-NN based text clustering
Yuwen Chen, Jian Cao, Shanshan Feng, and Yudong Tan, An Ensemble Learning Based Approach for Building Airfare Forecast Service
Mack Sweeney, Jaime Lester, and Huzefa Rangwala, Next-Term Student Grade Prediction
Sofia Apreleva and Alejandro Cantarero, Predicting the Location of Users on Twitter from Low Density Graphs
Chad Steed, Margaret Drouhard, Justin Beaver, Joshua Pyle, and Paul Bogen, Matisse: A Visual Analytics System for Exploring Emotion Trends in Social Media Text Streams
Philip S. Yu and Sihong Xie, Robust Crowd Bias Correction via Dual Knowledge Transfer from Multiple Overlapping Sources
Dongyao Wu, Sherif Sakr, Liming Zhu, and Qinghua Lu, Composable and Efficient Functional Big Data Processing Framework
Salvador Aguinaga, Aditya Nambiar, Zuozhu Liu, and Tim Weninger, Concept Hierarchies and Human Navigation
Hyunjoo Kim, Sriganesh Madhvanath, and Tong Sun, Hybrid Active Learning for Non-stationary Streaming Data with Asynchronous Labeling
Jianting Zhang, Simin You, and Le Gruenwald, Quadtree-Based Lightweight Data Compression for Large-Scale Geospatial Rasters on Multi-Core CPUs
Srikant Padala, Dinesh Kumar, Arun Raj, and Janakiram Dharanipragada, Octopus: A Multi-tenant Scheduler for Graphlab
Deepika Lalwani, Somayajulu D. V. L. N., and Radha Krishna Pisipati, A Community Driven Social Recommendation System
Ruben Tous, Anastasios Gounaris, Carlos Tripiana, Jordi Torres, Sergi Girona, Eduard Ayguadé, Jesús Labarta, Yolanda Becerra, David Carrera, and Mateo Valero, Spark Deployment and Performance Evaluation on the MareNostrum Supercomputer
Elias Alevizos, Alexander Artikis, Kostas Patroumpas, Marios Vodas, Yannis Theodoridis, and Nikos Pelekis, How not to drown in a sea of information: An event recognition approach
Zhenhua Chen, Jielong Xu, Jian Tang, Kevin Kwiat, and Charles Kamhoua, G-Storm: GPU-enabled High-throughput Online Data Processing in Storm
Orcun Yildiz, Shadi Ibrahim, Tran Anh Phuong, and Gabriel Antoniu, Chronos: Failure-Aware Scheduling in Shared Hadoop Clusters
Yongfeng Zhang, Task-based Recommendation on a Web-Scale
Xiaowei Jia, Aosen Wang, Xiaoyi Li, Guangxu Xun, Wenyao Xu, and Aidong Zhang, Multi-modal Learning for Video Recommendation based on Mobile Application Usage
Jiaoyan Chen, Huajun Chen, Daning Hu, Yalin Zhou, and Ming Wu, Smog Disaster Forecasting using Social Web Data and Physical Sensor Data
Roee Ebenstein and Gagan Agrawal, DSDQuery DSI - Querying Scientific Data Repositories with Structured Operators
Xiaoyi Li, Xiaowei Jia, and Aidong Zhang, Improving EEG Feature Learning via Synchronized Facial Video
Muyi Liu and Michael Gribskov, MMC-Margin: Identification of Maximum Frequent Subgraphs by Metropolis Monte Carlo Sampling
Smruti Padhy, Greg Jansen, Jay Alameda, Edgar Black, Liana Diesendruck, Mike Dietze, Praveen Kumar, Rob Kooper, Jong Lee, Rui Liu, Richard Marciano, Luigi Marini, Dave Mattson, Barbara Minsker, Chris Navarro, Marcus Slavenas, William Sullivan, Jason Votava, and Kenton McHenry, Brown Dog: Leveraging Everything Towards Autocuration
Afsin Akdogan, Saratchandra Indrakanti, Ugur Demiryurek, and Cyrus Shahabi, Cost-Efficient Partitioning of Spatial Data on Cloud
Kamalika Das, Kanishka Bhaduri, Bryan Matthews, and Nikunj Oza, Large scale support vector regression for aviation safety
Yue Wang, Ke Wang, Ada Wai-Chee Fu, and Raymond Sin-Kwok Wong, KeyLabel Algorithm for Keyword Search in Large Graphs
Enric Junqué de Fortuny, Theodoros Evgeniou, David Martens, and Foster Provost, Iteratively Refining SVMs
Harish Bhat, Nitesh Kumar, and Garnet Vaz, Towards Scalable Quantile Regression Trees
Dapeng Dong and John Herbert, Record-aware Two-level Compression for Big Textual Data Analysis Acceleration
Zhen Xie and Sencun Zhu, You Can Promote, But You Can’t Hide: Large-Scale Abused App Detection in Mobile App Stores
Kosuke Nakabasami, Toshiyuki Amagasa, Salman Shaikh, Franck Gass, and Hiroyuki Kitagawa, An Architecture for Stream OLAP Exploiting SPE and OLAP Engine
Chung-Hsien Yu, Dong Luo, Wei Ding, Joseph Cohen, David Small, and Shafiqul Islam, Spatio-Temporal Asynchronous Co-Occurrence Pattern for Big Climate Data towards Long-Lead Flood Prediction
Wei Xie, Jiang Zhou, Mark Reyes, Jason Noble, and Yong Chen, Two-Mode Data Distribution Scheme for Heterogeneous Storage in Data Centers
Teng Li, Jian Tang, and Jielong Xu, A Predictive Scheduling Framework for Fast and Distributed Stream Data Processing
Anthony Kleerekoper, Michael Pappas, Mikel Lujan, Gavin Brown, and Adam Pocock, A Scalable Implementation of Information Theoretic Feature Selection for High Dimensional Data
Lorenzo Gabrielli, Barbara Furletti, Roberto Trasarti, Fosca Giannotti, and Dino Pedreschi, City users’ classification with mobile phone data
Anas Abu-Doleh and Umit Catalyurek, Spaler: Spark and GraphX based de novo genome assembler
Pouria Pirzadeh, Michael Carey, and Till Westmann, BigFUN: A Performance Study of Big Data Management System Functionality
S M Faisal, G. Tziantzioulis, A. M. Gok, S. Parthasarathy, N. Hardavellas, and S. Ogrenci-Memik, Edge Importance Identification for Energy Efficient Graph Processing
Florin Schimbinschi, Xuan Vinh Nguyen, James Bailey, Chris Leckie, Hai Vu, and Ramamohanarao Kotagiri, Traffic Forecasting In Complex Urban Networks: Leveraging Big Data and Machine Learning
Kilho Shin, Tetsuji Kuboyama, Takako Hashimoto, and Dave Shepard, Super-CWC and Super-LCC: Super Fast Feature Selection Algorithms
Tonglin Li, Ke Wang, Shiva Srivastava, Dongfang Zhao, Kan Qiao, Iman Sadooghi, Xiaobing Zhou, and Ioan Raicu, A Flexible QoS Fortified Distributed Key-Value Storage System for the Cloud
Mahdi Ebrahimi, Aravind Mohan, Shiyong Lu, and Robert Reynolds, TPS: A Task Placement Strategy for Big Data Workflows
Keira Zhou, Jack Wadden, Jeffrey Fox, Ke Wang, Donald Brown, and Kevin Skadron, Regular Expression Acceleration on the Micron Automata Processor: Brill Tagging as a Case Study
Karla Caballero Barajas and Ram Akella, Prediction of Physiological Subsystem Failure and its Impact in the prediction of Patient Mortality
Yuqing Zhu, Yilei Wang, and Fan Wang, Improving Transaction Processing Performance By Consensus Reduction
Dipti Shankar, Xiaoyi Lu, Md. Wasi-ur-Rahman, Nusrat Islam, and Dhabaleswar K. Panda, Benchmarking Key-Value Stores on High-Performance Storage and Interconnects for Web-Scale Workloads
Luca Pappalardo, Dino Pedreschi, and Fosca Giannotti, Human Mobility and Economic Development
Roberto Tardío Olmos, Alejandro Maté Morga, and Juan Carlos Trujillo Mondéjar, An Iterative Methodology for Big Data Management, Analysis and Visualization
Robert P. Trevino, Steve A. Kawamoto, Thomas J. Lamkin, and Huan Liu, Cell Analytics in Compound Hit Selection of Bacterial Inhibitors
Don Libes, Seungjun Shin, and Jungyub Woo, Considerations and Recommendations for Data Contributions for Predictive Analytics for Manufacturing
Padmashree Ravindra, HyeongSik Kim, and Kemafor Anyanwu, Rewriting Complex SPARQL Analytical Queries for Efficient Cloud-based Processing
Li-Yung Ho, Fei Shao, Jan-Jan Wu, and Pangfeng Liu, Efficient Distributed Maximum Matching for Solving the Container Exchange Problem in the Maritime Industry

Industry and Government Program

Regular Paper

Xiuqiang He, Wenyuan Dai, Guoxiang Cao, Huyang Sun, Mingxuan Yuan, and Qiang Yang, Mining Target Users for Online Marketing based on App Store Data
Ahmed Metwally, Jia-Yu Pan, Minh Doan, and Christos Faloutsos, Scalable Community Discovery from Multi-Faceted Graphs
Ernesto Diaz-Aviles, Fabio Pinelli, Karol Lynch, Zubair Nabi, Yiannis Gkoufas, Eric Bouillet, Francesco Calabrese, Eoin Coughlan, Peter Holland, and Jason Salzwedel, Towards Real-time Customer Experience Prediction for Telecommunication Operators
I. Stephen Choi, Weiqing Yang, and Yang-Suk Kee, Early Experience with Optimizing I/O Performance Using High-Performance SSDs for In-Memory Cluster Computing
Hyunsik Choi, Yong In Lee, Jongyoung Park, Kangho Roh, and Kwanghyun La, An Evaluation of Alternative Shared-nothing Architecture for Analytical Processing Systems
Anjan Goswami, Wei Han, Zhenrui Wang, and Angela Jiang, Controlled Experiments for Decision-Making in e-Commerce Search
Jenny Williams, Paul Cuddihy, Justin McHugh, Kareem Aggour, and Arvind Menon, Semantics for Big Data Access & Integration: Improving Industrial Equipment Design through Increased Data Usability
Laura Rettig, Mourad Khayati, Michal Piorkowski, and Philippe Cudré-Mauroux, Online Anomaly Detection over Big Data Streams
Aungon Nag Radon, Ke Wang, Uwe Glaesser, Hans Wehn, and Andrew Westwell-Roper, Contextual Verification for False Alarm Reduction in Maritime Anomaly Detection
Tanay Saha, Mohammad Hasan, Chandler Burgess, Md Ahsan Habib, and Jeff Johnson, Batch Mode Active Learning for Technology-Assisted Review
Mayank Kejriwal, Qiaoling Liu, Ferosh Jacob, and Faizan Javed, A Pipeline for Extracting and Deduplicating Domain-Specific Knowledge Bases
Fang-Hsiang Su, Manas Somaiya, Shrish Mishra, and Rajyashree Mukherjee, EXOS: EXpansion On Session for Enhancing Effectiveness of Query Auto-Completion
Gergely Acs, Jagdish Prasad Achara, and Claude Castelluccia, Probabilistic km-anonymity (Efficient Anonymization of Large Set-Valued Datasets)
Sauptik Dhar, Congrui Yi, Naveen Ramakrishnan, and Mohak Shah, ADMM based Scalable Machine Learning on Spark
Dapeng Dong and John Herbert, Record-aware Compression for Big Textual Data Analysis Acceleration
Alekh Jindal, Samuel Madden, Malú Castellanos, and Meichun Hsu, Graph Analytics using Vertica Relational Database
Andre Luckow, Ken Kennedy, Fabian Manhard, Emil Djerekarov, Bennie Vorster, and Amy Apon, Automotive Big Data: Applications, Workloads and Infrastructures
Goktug Cinar, Jeffrey Thompson, and Soundar Srinivasan, Cost-sensitive optimization of automated inspection
Nicolas Poggi, Josep Lluís Berral, David Carrera, Aaron Call, Rob Reinauer, Nikola Vujic, Daron Green, José Blakeley, and Fabrizio Gagliardi, From Performance Profiling to Predictive Analytics while Evaluating Hadoop Cost-Efficiency in ALOJA
Mohammed Korayem, Camilo Ortiz, Khalifeh AlJadda, and Trey Grainger, Query Sense Disambiguation Leveraging Large Scale User Behavioral Data
Viet Ha-Thuc, Ganesh Venkataraman, Mario Rodriguez, Shakti Sinha, Senthil Sundaram, and Lin Guo, Personalized Expertise Search at LinkedIn
Vinay Deolalikar, How Valuable is Your Data? A Quantitative Approach using Data Mining
Kang Li, Vinay Deolalikar, and Neeraj Pradhan, Mining Lifestyle Personas at Scale in E-commerce
Petros Zerfos, Hangu Yeo, Brent Paulovicks, and Vadim Sheinin, SDFS: Secure Distributed File System for Data-at-Rest Security for Hadoop-as-a-Service
Sreenivas Sukumar, Open Research Challenges with Big Data - A Data-Scientist's Perspective
Hamed Yaghoubi Shahir, Uwe Glässer, Amir Yaghoubi Shahir, and Hans Wehn, Maritime Situation Analysis Framework: Vessel Interaction Classification and Anomaly Detection
Levente Klein, Fernando Marianno, Conrad Albrecht, Marcus Freitag, Siyuan Lu, Nigel Hinds, Hendrik Hamann, and Sergio Bermudez, PAIRS: A scalable geo-spatial data analytics platform
Jayasimha Katukuri, Tolga Konik, Rajyashree Mukherjee, and Santanu Kolay, Post-Purchase Recommendations in Large-scale Online Marketplaces
Hong-Han Shuai, Chih-Ya Shen, Hsiang-Chun Hsu, De-Nian Yang, Chung-Kuang Chou, Jihg-Hong Lin, and Ming-Syan Chen, Revenue Maximization for Telecommunications Company with Social Viral Marketing

Short Paper

Stephanie Rosenthal, Scott McMillan, and Matthew Gaston, Developer Toolchains for Large-Scale Analytics: Two Case Studies
Harshal Godhia, Bibek Panda, Swarnim Narayan, and Ramakrishna Vadakattu, Enterprise Subscription Churn Prediction
Joshua Seeger, Aron Culotta, Jason Keller, Patrick van Kessel, and Michael Jugovich, Data Deidentification in Medical Transcriptions using Regular Expressions and Machine Learning
Qinlong Luo, Meng Zhao, Faizan Javed, and Ferosh Jacob, Macau: Large-Scale Skill Sense Disambiguation in the Online Recruitment Domain
Wei Yi Liu, Morris H. Hsiao, and Shih Yao Dai, DNA Analysis with MapReduce
Chaitali Gupta, Ranjan Sinha, and Yong Zhang, Eagle: User Profile-based Anomaly Detection in Hadoop Clusters
Manuel Diaz-Granados, Javier Diaz-Montes, and Manish Parashar, Investigating Insurance Fraud using Social Media
Luca Cazzanti, Leonardo Millefiori, and Gianfranco Arcieri, A Document-based Data Model for Large Scale Computational Maritime Situational Awareness
Jhao-Yin Li, Mi-Yen Yeh, Ming-Syan Chen, and Jihg-Hong Lin, Modeling Social Influences from Call Records and Mobile Web Browsing Histories (Extended Abstract)
Christian Seebode, Matthias Ort, Peter Hufnagl, and Christian R. A. Regenbrecht, Next Generation Biobanks