Modern Programming for Data Analytics - Git and GitHub

Instructor: Shawn T. Brown

Git

  • A complete distributed version control system
  • Invented by Linus Torvalds because other version control systems did not do what he needed for distributed development
  • By far the most widely used modern version control system in the world

Why use Git?

  • Allows you to capture the complete provenance of the development process
  • Projects are not tied to a single copy of the software; as long as the repository exists somewhere, everything can be recreated
  • Ideal for managing a group of developers working on the same project
  • Ideal for managing distributed versions and development
  • Git is open source and freely available

How does Git work?

Adapted from the Git tutorial by Pierre Rioux, McGill University
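
One piece of how Git works can be sketched in a few lines: Git stores file contents as "blob" objects and names each one by a hash of its contents, so identical content always gets the same ID. The sketch below is illustrative only and is not taken from the tutorial cited above. The function name git_blob_id is hypothetical, and it assumes a repository using Git's default SHA-1 object format (repositories created with the SHA-256 format would produce different IDs).

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git would assign to a file with this content.

    Hypothetical helper for illustration; assumes the default SHA-1
    object format.
    """
    # Git prefixes the content with a header: the object type, a space,
    # the content length in bytes as decimal text, and a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    # The object ID is the SHA-1 digest of header + content, in hex.
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Same value as: echo 'hello world' | git hash-object --stdin
    print(git_blob_id(b"hello world\n"))
```

Because the ID depends only on the content, two developers who commit the same file independently end up with the same blob, which is part of what lets any full copy of the repository recreate the project history.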

GitHub

  • GitHub is a web-based platform that integrates with Git
  • Provides a complete solution for managing an open-source software project
  • Offers complete tutorials and guides on how to use GitHub
  • It is worth learning GitHub thoroughly if you plan to be a software developer

Alternatives to GitHub

  • Bitbucket - Atlassian's source code management platform
    • Integrates fully with the Atlassian project management suite
    • Used in industry and functions much like GitHub
  • GitLab - A web-based DevOps management platform
    • Can be installed and hosted on your own servers
  • Both of these platforms function quite similarly to GitHub

GitHub Concepts

  • Creating a New Repository
    • README, License, .gitignore
  • Branching and Forking
  • Pull Requests
  • Issues and Code Review
  • Version Preparation
  • GitHub Pages (github.io)

Helpful Links