DATA 301 - Introduction to Data Science

Table of contents

  1. Syllabus
  2. Schedule
  3. Technology
  4. Project

Syllabus

Course Information

Lecture meeting: Tuesday and Thursday from 9:40 - 11:00 AM in 20-140 (Engineering East)

Lab meeting: Tuesday and Thursday from 12:10 - 1:30 PM in 192-206 (Engineering IV)

Professor Information

Paul Anderson, PhD

Email: pander14@calpoly.edu

Office: 222 Building 14

Office hours:

  • Tuesday and Thursday from 1:30 - 3:30 PM
  • Wednesday from 12:00 - 1:00 PM (online only)

Graders

Grant Bernosky

Course Prerequisites and Goals

Data science is often described as the intersection of computer science and statistics. While data science is much more than this, it is true that you need to know both computer science and statistics to be a successful data scientist. We assume that everyone has taken at least STAT 302/312 and CPE 102.

This class is taught in Python, which for most of you was the language used in CPE 101. We will not go over basic Python syntax in this class. For example, I will assume that you know how lists and dicts are used in Python. If you do not know Python or need a refresher, please go through the Codecademy course.
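As a rough self-check, you should be able to read a snippet like the following and predict what `counts` and `ranked` contain without running it. This is a hypothetical illustration of the assumed list and dict fluency, not taken from the course materials; the word list and variable names are made up.

```python
# Hypothetical self-check: counting words with a list and a dict.
words = ["data", "science", "data", "python", "data"]

counts = {}
for word in words:
    # dict.get returns the current count, or 0 if the word has not been seen yet
    counts[word] = counts.get(word, 0) + 1

# Distinct words ordered from most to least frequent
ranked = sorted(counts, key=counts.get, reverse=True)

print(counts)
print(ranked)
```

If the loop or the `sorted` call is unfamiliar, that is a good sign a refresher would help.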

The type of programming you will be doing in this class is likely very different from the programming you do in CS classes. Rather than writing long, complex programs and then testing them, you will constantly iterate between writing code and running it to see what it does.

My goal for the lab, and for the class in general, is to incorporate as much active learning and as many learn-by-doing activities as possible. To accomplish this, we will use online videos, readings before class, and similar materials. That said, I will use at least a portion of almost every lecture to introduce the day's topics and foster discussion.

This course is not an introduction to machine learning or artificial intelligence. Other classes in this sequence cover those topics. This course will treat machine learning mostly as a black box. Instead, the focus is on data manipulation, summarization, processing, pipelining, and analytics.

Course Learning Objectives

  1. Identify different types of data used in data science and know their properties

  2. Write programs implementing key data manipulation, management, and analytic tasks

  3. Choose techniques to solve common data analytic tasks, and perform appropriate (simple) data analyses with given data

  4. Visualize, demonstrate and explain the results of data analyses to customers/data owners

Textbook and Other Material

With few exceptions, all of our course content will be available in a GitHub repository: https://github.com/Anderson-Lab/data-301-student.

JupyterHub: https://data301-f19.jupyter.sh

GitHub Classroom Link (please link your Cal Poly email): https://classroom.github.com/a/IgvWJlmt

Grading

Labs: 20% (anything you don’t complete during lab is due before the next lab day)

3 Exams: 30% (10% each)

Final Project: 40%

Reflections: 5%

Participation: 5% (Measured via pop quizzes and subsequent discussion)
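To see how these weights combine, here is a minimal sketch in Python. It assumes each category is scored from 0 to 100 and that the three exams are averaged into one "exams" score (10% each is equivalent to 30% on their average). The function name, the dictionary keys, and the example scores are hypothetical.

```python
# Weights taken from the grading breakdown above; they sum to 100%.
WEIGHTS = {
    "labs": 0.20,
    "exams": 0.30,          # three exams at 10% each
    "final_project": 0.40,
    "reflections": 0.05,
    "participation": 0.05,
}


def course_percentage(scores):
    """Return the weighted course percentage from per-category scores (0-100)."""
    return sum(WEIGHTS[category] * scores[category] for category in WEIGHTS)


# Hypothetical scores, for illustration only
example_scores = {
    "labs": 90,
    "exams": 80,
    "final_project": 85,
    "reflections": 100,
    "participation": 100,
}

total = course_percentage(example_scores)
print(round(total, 1))  # 18 + 24 + 34 + 5 + 5, i.e. about 86
```

Note that the final project alone carries twice the weight of the labs.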

Attendance

Attendance will not be recorded each class. Instead, attendance is reflected in your grade through the pop quizzes, which will be graded for effort.

Late Policy

No late days will be allowed.

Feedback and Assessment

My goal is to have lab feedback provided to students each week on the prior week’s labs.

Feedback on pop quizzes will be obtained immediately through discussion.

Feedback on the final project will be available for pickup after the quarter is completed.

Feedback on the exams will be provided within a week of each exam.

Electronics Policy

Electronics may not be used in the classroom during lecture. This includes phones, laptops, and tablets. If you have a good educational reason for using an electronic device during lecture, please see me. Note: if your phone rings or beeps during lecture, you should bring in cookies for the entire class.

Grading Scale

Grading Scale: A: 90-100; B: 80-89; C: 70-79; D: 60-69; F: <60. Plusses and minuses will be used at the discretion of the instructor.
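The scale above can be sketched as a small Python function. This is an illustration only: it returns the base letter without plusses or minuses (those are at the instructor's discretion), and it assumes fractional percentages such as 89.5 are not rounded up, which the scale does not specify. The function name is hypothetical.

```python
def letter_grade(percentage):
    """Map a course percentage (0-100) to a base letter grade, without plus/minus."""
    # Assumption: cutoffs are applied to the raw percentage, with no rounding.
    if percentage >= 90:
        return "A"
    if percentage >= 80:
        return "B"
    if percentage >= 70:
        return "C"
    if percentage >= 60:
        return "D"
    return "F"


for score in (95, 85, 72, 60, 59.9):
    print(score, letter_grade(score))
```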

Grading Guidelines: Submitted work requires Analysis, Evaluation, and Creation of ideas, concepts, and materials into various deliverables (e.g., see revised Bloom’s Taxonomy and reference below).

  • The grade of A is for work that involves high-quality achievement in all three Bloom areas.
  • The grade of B is for work that involves high-quality achievement in at least two Bloom areas, and medium-level achievement in the other.
  • The grade of C is for work that involves high-quality achievement in at least one Bloom area, and medium-level achievement in the others.
  • The grade of F is for work that does not meet the above criteria.

Reference: Errol Thompson, Andrew Luxton-Reilly, Jacqueline L. Whalley, Minjie Hu, and Phil Robbins. 2008. Bloom’s taxonomy for CS assessment. In Proceedings of the tenth conference on Australasian computing education - Volume 78 (ACE ’08), Simon Hamilton and Margaret Hamilton (Eds.), Vol. 78. Australian Computer Society, Inc., Darlinghurst, Australia, 155-161.

Feedback will be given as quickly as possible, with a goal of returning it within a week of the assignment due date.

Honor Code

Lying, cheating, attempted cheating, and plagiarism are violations of our Honor Code that, when identified, are investigated. Each instance is examined to determine the degree of deception involved.

Incidents where the professor believes the student’s actions are clearly related more to ignorance, miscommunication, or uncertainty can be addressed by consultation with the student. We will craft a written resolution designed to help prevent the student from repeating the error in the future. The resolution, submitted by form and signed by both the professor and the student, is forwarded to the Dean of Students and remains on file.

Cases of suspected academic dishonesty will be reported directly to the Dean of Students. A student found responsible for academic dishonesty will receive an XF in the course, indicating failure of the course due to academic dishonesty. This grade will appear on the student’s transcript for two years, after which the student may petition for the X to be expunged. The student may also be placed on disciplinary probation, suspended (temporary removal), or expelled (permanent removal) from the College by the Honor Board.

It is important for students to remember that unauthorized collaboration (working together without permission) is a form of cheating. Unless a professor specifies that students can work together on an assignment and/or test, no collaboration is permitted. Other forms of cheating include possessing or using an unauthorized study aid (such as a PDA), copying from another’s exam, fabricating data, and giving unauthorized assistance.

Remember, research conducted and/or papers written for other classes cannot be used in whole or in part for any assignment in this class without obtaining prior permission from the professor.

Diversity Statement (Cal Poly official statement)

At Cal Poly we believe that academic freedom, a cornerstone value, is exercised best when there is understanding and respect for our diversity of experiences, identities, and world views. Consequently, we create learning environments that allow for meaningful development of self-awareness, knowledge, and skills alongside attention to others who may have experiences, worldviews, and values that are different from our own. In so doing, we encourage our students, faculty, and staff to seek out opportunities to engage with others who are both similar and different from them, thereby increasing their capacity for knowledge, empathy, and conscious participation in local and global communities.

In the spirit of educational equity, and in acknowledgement of the significant ways in which a university education can transform the lives of individuals and communities, we strive to increase the diversity at Cal Poly. As an institution that serves the state of California within a global context, we support the recruitment, retention, and success of talented students, faculty, and staff from across all societies, including people who are from historically and societally marginalized and underrepresented groups.

Cal Poly is an inclusive community that embraces differences in people and thoughts. By being open to new ideas and showing respect for diverse points of view, we support a climate that allows all students, faculty, and staff to feel valued, which in turn facilitates the recruitment and retention of a diverse campus population. We are a culturally invested university whose members take personal responsibility for fostering excellence in our own and others’ endeavors. To this end, we support an increased awareness and understanding of how one’s own identity facets (such as race, ethnicity, gender, sexual orientation, religion, age, disability, social class, and nation of origin) and the combinations of these identities and experiences that may accompany them can affect our different worldviews.

Disability Accommodations

Any student who feels they may need an accommodation based on the impact of a disability should contact me individually to discuss their specific needs. Also, please contact the Disability Resource Center: https://drc.calpoly.edu/content/drc-services.