D2: Technology
Installing software and setting up our environments can take time. Please don’t hesitate to help those around you if you move a little faster.
Slack
Slack tends to be popular with tech-savvy employees, for instance. It is because most IT decision-makers are trying to accommodate their engineering teams and others who love Slack, while also trying to standardize the rest of the company around Microsoft Teams, because it integrates so nicely with Office 365. ref
- Questions about how we use Slack in class?
R and Python
If you work in data science or analytics, you’re probably well aware of the Python vs. R debate. In many ways, the two open source languages are very similar. Free to download for everyone, both languages are well suited for data science tasks — from data manipulation and automation to business analysis and big data exploration. The main difference is that Python is a general-purpose programming language, while R has its roots in statistical analysis. Increasingly, the question isn’t which to choose, but how to make the best use of both programming languages for your specific use cases. ref
VS Code for data science
Visual Studio Code (VSCode) is the most popular development environment, period. If you need proof, here’s the Stack Overflow Survey from 2019 ref
You can review our class guide directions.
- Install VS Code
- Set up the interactive Python environment
- Set up the R environment
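Once the three steps above are done, a quick sanity check is to see whether each tool can be found on your PATH. The sketch below is only an illustration: the command names `code`, `python3`, and `Rscript` are assumptions that vary by operating system and install method, and a missing entry may simply mean the tool was installed without being added to PATH.

```python
import shutil

# Assumed command names; adjust them for your platform (for example, "python" on Windows).
TOOLS = {"VS Code": "code", "Python": "python3", "R": "Rscript"}


def find_tools(tools):
    """Map each tool label to its resolved path, or None if it is not on PATH."""
    return {label: shutil.which(command) for label, command in tools.items()}


if __name__ == "__main__":
    for label, path in find_tools(TOOLS).items():
        print(f"{label}: {path or 'not found on PATH'}")
```

If something shows as not found, the class guide directions are the place to go back to before asking a neighbor for help.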
Git and GitHub
It is no exaggeration to say that git (and other forms of version control software) underlie the entire world of open source software, and are central to the operation of nearly every tech company on the planet. … OK, now the bad news: there’s a reason I put so much energy into discussing the value of git before getting into how it works: learning git kinda sucks. I mean, it’s not painful like performing an appendectomy on yourself without anesthesia, and it’s not hard like quantum mechanics or geometric topology; it’s definitely something anyone can learn. ref
You can review our class guide directions on Git and GitHub.
- Install Git
- Create a GitHub account
- Connect to our GitHub org (enter your username in the linked Google Sheet)
- Build the base for your first personal data project
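After installing Git, it can help to confirm that it runs and that it knows who you are. The sketch below reads Git's standard `user.name` and `user.email` settings. It assumes the class workflow expects those to be set, which the class guide may or may not require, and the helper name `git_setting` is made up for this example.

```python
import shutil
import subprocess


def git_setting(key):
    """Return the value of a git config key, or None if git or the key is missing."""
    if shutil.which("git") is None:
        return None  # git is not installed or not on PATH
    result = subprocess.run(
        ["git", "config", "--get", key],
        capture_output=True,
        text=True,
        check=False,  # a missing key gives a non-zero exit code, which is fine here
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    for key in ("user.name", "user.email"):
        print(f"{key}: {git_setting(key) or 'not set'}")
```

If either value prints as not set, `git config --global user.name "Your Name"` (and the same for `user.email`) fills it in.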
What is next?
On Monday, we will review Git/GitHub and our standard workflow. We will then dive into R for data visualization.
Understanding questions
- How many lines of code have you written using dplyr?
- How many lines of code have you written using ggplot2?
- How long would it take you to recreate this graphic?
- How much time have you spent munging messy data?
- Have you used Git or GitHub before?