DATA EXPLORATION, CLEANING, AND INTEGRATION FOR DATA SCIENCE

COMP SCI 774
Course Description

Big Data is often said to deal with four Vs: volume, velocity, variety, and veracity. The focus is on the variety and veracity challenges, which often arise in data science projects. In many such projects, data is incorrect, hard to understand, and comes from a variety of sources. Data scientists often spend 80% of their effort exploring, cleaning, and integrating this data before analysis can be carried out to extract insights. As a result, managing variety and veracity has received significant attention. Study these topics, understand their challenges, and discuss solutions. These solutions often require data management, machine learning, big data scaling, cloud, crowdsourcing, and user interaction techniques. Knowledge of machine learning/AI [COMP SCI 540], databases [COMP SCI 564], and Python [COMP SCI 320] is recommended.

Prerequisites

Graduate/professional standing

Satisfies
Credits

Not Reported

Offered

Not Reported

Grade Point Average
3.97

No change from Historical

Completion Rate
100%

No change from Historical

A Rate
94.12%

No change from Historical

Class Size
51

No change from Historical

Cumulative Grade Distribution

Instructors (Summer 2026)

Similar Courses