Big Data Management and Analytics Workshop

Massive amounts of structured and unstructured data are being generated by online user activity on e-commerce and social-media websites. Big Data, as this massive amount of data is called, poses many challenges:

  • How do companies manage and process this massive amount of data?
  • How do companies automatically learn hidden trends and patterns in this data?
  • How do companies gather actionable intelligence to improve their bottom line?

This workshop will cover tools and techniques that directly address these challenges. Taught by world-renowned data science experts, this workshop is meant for technical managers and software developers. No advanced knowledge is needed.

  • Storing and processing Big Data: Hadoop, MapReduce, NoSQL
  • Tools and techniques for analyzing Big Data
  • Fundamentals of learning from Big Data
  • Practical machine learning tools for Big Data
  • Processing unstructured and natural language data
  • Practical applications of Big Data

The workshop will consist of lectures and hands-on practice labs. After attending this workshop you will be able to manage, analyze and learn from Big Data using commonly available tools.

Workshop fee: $995
Please register here by November 14, 2013.

Hyatt Regency - Rate: $89/night
972-231-9600 / 866-593-6300

The lectures will be given by world-renowned experts in the areas of data science, data mining, machine learning and text processing:

  • Dr. Latifur Khan, Ph.D., Computer Science, University of Southern California

    Dr. Khan is an expert on Big Data analytics and management. He has been working in the data mining and management areas for over 15 years. To date, he has developed a number of scalable algorithms to process queries over very large amounts of complex data using cloud computing frameworks (e.g., Hadoop, Cassandra, and NoSQL stores). He also develops novel mining techniques to identify unknown patterns in evolving continuous data streams. His approaches have been applied successfully in a number of domains such as cyber-security, social networks, and the semantic web. For example, IARPA and Raytheon funded his work on large-scale semantic web graph processing and retrieval. Tektronix funded his research on analyzing telecommunications logs for performance monitoring and quality assurance using NoSQL data models. Currently, he is teaching a graduate-level course on Big Data management and analytics.

  • Dr. Vibhav Gogate, Ph.D., Computer Science, The University of California at Irvine. Post-doctoral fellow, The University of Washington

    Dr. Gogate's research interests are in artificial intelligence, machine learning and data mining. His ongoing focus is on probabilistic graphical models and their first-order logic based extensions such as Markov logic. He has published over 25 papers in top-tier conferences and journals such as AAAI, UAI, NIPS, AISTATS, AIJ and JAIR. He is the co-winner of the last two probabilistic inference competitions: the 2010 UAI approximate inference challenge and the 2012 PASCAL probabilistic inference competition. Dr. Gogate's group is currently developing "Alchemy 2.0," a general-purpose software package for inference and learning in Markov logic. The package is based on a novel framework called probabilistic theorem proving, a family of lifted inference algorithms that can yield substantial, potentially infinite speedups over inference algorithms for probabilistic graphical models. Alchemy 2.0 allows the user to easily develop a wide range of real-world applications, including link prediction, entity (co-reference) resolution, social network modeling and information extraction.

  • Dr. Vincent Ng, Ph.D., Computer Science, Cornell University. Director, Center for Machine Learning and Language Processing of the Human Language Technology Research Institute at UTD.

    Dr. Ng's primary area of research is machine learning for natural language processing. His recent projects have focused on developing machine learning techniques to reduce the amount of annotated data needed to build natural language applications. His project on extraction and categorization of social media posts over time aims to develop machine learning techniques to extract entities and relations from social media posts and track changes in people's opinions over time. His project on semantics-based, weakly supervised coreference resolution investigates weakly supervised learning techniques for porting an English coreference resolution system developed for the news domain to other languages and other domains.


  • Dr. Yang Liu, Ph.D., Electrical Engineering, Purdue University. Post-doctoral fellowship, University of California, Berkeley.

    Dr. Liu's research areas are speech and natural language processing using machine learning techniques. One of her research aims is to generate summaries that ease access to overwhelming amounts of information (e.g., summaries of meetings, news articles, and Twitter trending topics). Another focus of her research is processing heterogeneous data: data from different modalities (speech or text), formal or informal genres (e.g., social media data), and structured or unstructured data. She also works on generating information targeted at particular users, for example, recommending interesting threads to a user in forum discussions. Two of her recent projects are speech summarization, which aims to generate summaries of speech recordings (e.g., multiparty meetings), and deep exploration and filtering of text, which aims to extract information and identify anomalies in a large collection of speech and text data.