CS3352 Foundations Of Data Science is a Semester III subject in the B.E. Computer Science and Engineering programme of Anna University, under Regulation 2021 for affiliated institutions. This page sets out its syllabus so that students can use it when preparing notes.
We include every topic of the Foundations Of Data Science syllabus so that you can understand the subject well. Having the CS3352 syllabus at your fingertips should help you plan a clear path of preparation. We hope this article helps you; please share it with your friends.
If you want to know more about the syllabus of the four-year B.E. Computer Science and Engineering undergraduate degree programme at affiliated institutions, we provide a detailed year-wise, semester-wise, and subject-wise syllabus at the following link: B.E Computer Science and Engineering Syllabus Anna University, Regulation 2021.
Course Objectives:
- To understand the data science fundamentals and process.
- To learn to describe the data for the data science process.
- To learn to describe the relationship between data.
- To utilize the Python libraries for Data Wrangling.
- To present and interpret data using visualization libraries in Python.
CS3352 – Foundations Of Data Science Syllabus
Unit I: Introduction
Data Science: Benefits and uses – Facets of data – Data Science Process: Overview – Defining research goals – Retrieving data – Data preparation – Exploratory Data Analysis – Build the model – Presenting findings and building applications – Data Mining – Data Warehousing – Basic Statistical descriptions of Data
Unit II: Describing Data
Types of Data – Types of Variables – Describing Data with Tables and Graphs – Describing Data with Averages – Describing Variability – Normal Distributions and Standard (z) Scores
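As a quick illustration of the last topic in this unit, here is a minimal sketch of computing standard (z) scores with Python's standard library. The function name `z_scores` and the sample values are made up for this example, and the sketch assumes the population standard deviation is the one wanted.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the standard (z) score of each value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption of this sketch)
    if sigma == 0:
        raise ValueError("z scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Made-up sample used only for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(scores)
print(z)
```

A z score of 0 marks a value equal to the mean, and a z score of 2 marks a value two standard deviations above it.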
Unit III: Describing Relationships
Correlation – Scatter plots – Correlation coefficient for quantitative data – Computational formula for correlation coefficient – Regression – Regression line – Least squares regression line – Standard error of estimate – Interpretation of r² – Multiple regression equations – Regression towards the mean
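The correlation coefficient and the least squares regression line listed in this unit can be sketched in plain Python as follows. The helper names (`sums_of_squares`, `pearson_r`, `least_squares_line`) and the data are hypothetical, chosen only for illustration; the formulas are the usual ones based on sums of squares and the sum of products.

```python
from math import sqrt


def sums_of_squares(x, y):
    """Return (SS_x, SS_y, SP_xy) for paired observations x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return ss_x, ss_y, sp_xy


def pearson_r(x, y):
    """Correlation coefficient r = SP_xy / sqrt(SS_x * SS_y)."""
    ss_x, ss_y, sp_xy = sums_of_squares(x, y)
    return sp_xy / sqrt(ss_x * ss_y)


def least_squares_line(x, y):
    """Return (slope, intercept) of the least squares line y' = slope * x + intercept."""
    ss_x, _, sp_xy = sums_of_squares(x, y)
    slope = sp_xy / ss_x
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return slope, intercept


# Made-up paired data used only for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
slope, intercept = least_squares_line(x, y)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"y' = {slope:.2f} * x + {intercept:.2f}")
```

Here r² is the proportion of the variability in y that is predictable from x through the fitted line.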
Unit IV: Python Libraries For Data Wrangling
Basics of NumPy arrays – Aggregations – Computations on arrays – Comparisons, masks, boolean logic – Fancy indexing – Structured arrays – Data manipulation with Pandas – Data indexing and selection – Operating on data – Missing data – Hierarchical indexing – Combining datasets – Aggregation and grouping – Pivot tables
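To give a feel for a few of the NumPy topics in this unit (aggregations, comparisons, masks, boolean logic, and fancy indexing), here is a small sketch. It assumes NumPy is installed, and the `temps` array holds made-up readings used only for illustration.

```python
import numpy as np

# Made-up temperature readings used only for illustration
temps = np.array([21.5, 19.0, 25.2, 30.1, 18.4, 27.8])

# Aggregation: reduce the whole array to a single value
highest = temps.max()

# Comparison: produces a boolean mask with the same shape as temps
warm = temps > 25

# Counting True entries in a mask (True counts as 1)
warm_count = int(np.sum(warm))

# Masking: keep only the elements where the mask is True
warm_values = temps[warm]

# Boolean logic: combine masks element-wise with & (and), | (or), ~ (not)
in_range = temps[(temps > 19) & (temps < 28)]

# Fancy indexing: select elements by a list of positions
picked = temps[[0, 2, 5]]

print(highest, warm_count, warm_values, in_range, picked)
```

Note that the element-wise operators `&` and `|` are used instead of the keywords `and` and `or`, and each comparison is wrapped in parentheses because of operator precedence.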
Unit V: Data Visualization
Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and contour plots – Histograms – legends – colors – subplots – text and annotation – customization – three dimensional plotting – Geographic Data with Basemap – Visualization with Seaborn.
Text Books:
- David Cielen, Arno D. B. Meysman, and Mohamed Ali, “Introducing Data Science”, Manning Publications, 2016. (Unit I)
- Robert S. Witte and John S. Witte, “Statistics”, Eleventh Edition, Wiley Publications, 2017. (Units II and III)
- Jake VanderPlas, “Python Data Science Handbook”, O’Reilly, 2016. (Units IV and V)
References:
- Allen B. Downey, “Think Stats: Exploratory Data Analysis in Python”, Green Tea Press, 2014.
Related Posts On Semester III:
- MA3354 – Discrete Mathematics
- CS3351 – Digital Principles and Computer Organization
- CS3301 – Data Structures
- CS3391 – Object-Oriented Programming