Most of the core business data in an enterprise are stored in relational databases that support Structure Query Language (SQL). When data scientists perform tasks such as data cleanse, validation, manipulation and feature variable calculation using SQL within database environments, they can achieve important advantages such as more compact code, easier deployment and higher security in comparison to moving the data outside of the database to a separate analytics environment.In this course, students will learn practical Oracle SQL skills to solve problems such as:
- Data Validation
- Data Summary
- Missing Value Handling
- Detect and Remove Duplicates
- Binning Variable Based on Equal Frequency
- Build Good Variables for Predictive Models or Business Rules, e.g., RFM Analysis, Time Elapse Since Last Purchase, Number of Transactions in Last 3 Days, Moving Average Purchase Amount in Last 7 Days
- Random Sampling
- Gain Chart
- Using View to Organize Process Flows
There are totally about 4 hours and 30 minutes video presentations. SQL scripts that create data sets and perform the Data Science tasks are provided. Slides are included so that students can easily find the topics that they interested in.
This course requires that students already have basic knowledge about SQL.
The instructor, Dr. Jiang Zhou , has nearly two decades of experience building predictive analytics solutions across industries including telecommunication, banking and insurance. These solutions have resulted in over $200 million savings for clients. Dr. Zhou has founded/co-founded 3 data analytics companies, i.e., Business Data Miners (http://businessdataminers.com/), ZSAnalytics (http://zsanalytics.com/) and 慧信 (Smart Credit). He is the Chief Analytics Officer of ZSAnalytics and the Chief Technology Officer of 慧信. Previously, he was a senior modeling engineer at Nestor, Inc., the chief statistician at Lightbridge, Inc., a vice president at Citizens Bank and a consulting member of technical staff at Oracle Inc. He has made presentations about data analytics at American Bankers Association(ABA) conferences, Predictive Analytics World ( "Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study" at Predictive Analytics World), universities and other events.
Dr. Zhou has been involved in 3 head to head competitions to build the best predictive models, i.e., a customer credit risk model for a top 3 cell phone company, a bank card fraud detection model for a top 15 bank, and a direct sales model for a marketing company. Dr. Zhou's models have won all 3 competitions.
In addition to his technical skills, he is, as one of his clients put it, "a great trainer, and a good presenter of theoretical data mining concepts so that they can be understood by most". He is the author of the blog www.deep-data-mining.com that is regarded as one of the best on data analytics and data mining.
Dr. Zhou is the primary author of an award-wining scientific paper, "Using genetic learning artificial neural networks for spatial decision making in GIS" (Nov., 1996, PE & RS). His scientific paper titled "A wavelet transform method to merge Landsat TM and SPOT panchromatic data" (1998,International Journal of Remote Sensing) is one of the most cited in the field.
Dr. Zhou obtained a B.S. from Zhejiang University, M.S. from Beijing University, and a Ph.D. from the University of Connecticut.