New York University

4 Washington Place, 2nd Floor, Manhattan (for Python) and
1 MetroTech Center, 19th Floor, Brooklyn (for R)
Mar 17-18, 2014
9:00 am - 4:30 pm

Instructors: Elena Glassman, Matthew Lightman, Michael Selik, Sarah Supp, Tracy Teal, Alex Viana, David Warde-Farley

What: Our goal is to help scientists and engineers become more productive by teaching them basic computing skills such as program design, version control, testing, and task automation. In this two-day bootcamp, short tutorials will alternate with hands-on practical exercises. Participants will be encouraged both to help one another and to apply what they have learned to their own research problems during and between sessions. Attendees are also offered online office hours: regular sessions for one-on-one help from Software Carpentry instructors.

Who: The course is aimed at postgraduate students and other scientists who are familiar with basic programming concepts (such as loops, conditionals, arrays, and functions) but need help translating this knowledge into practical tools that let them work more productively.

Requirements: Participants must bring a laptop with a few specific software packages installed. (The list will be sent to participants a week before the bootcamp.)

Content: The syllabus for this bootcamp is listed under Topics below.

Contact: Please mail admin@software-carpentry.org for more information.


Locations

Python will be at the Manhattan campus: 4 Washington Place, 2nd Floor.

R will be at the Brooklyn campus: 1 MetroTech Center, 19th Floor.
Please refer to the NYU R room's webpage for more details.

Setup

The Bash Shell

Bash is a commonly used shell. Working in a shell lets you automate and combine tasks, so you can get more done with your computer more quickly.

Mac OS X already includes a bash shell in the Terminal application. For Windows, we suggest you download and install msysGit, which includes both a bash shell (Git BASH) and git itself.

Git

Git is a widely used version control system. It lets you track who changed what, and when, and it has options for easily updating a shared or public version of your code on github.com.

You can follow the git setup instructions on GitHub.

Editor

When you're writing code, it helps to have a text editor that is designed for it, with features like automatic color-coding of keywords (syntax highlighting).

Sublime Text is a good editor that works on Windows, Mac OS X, and Linux.

Python

Python is becoming more and more popular in scientific computing, and it's a great language for teaching general programming concepts due to its easy-to-read syntax. We will be using Python version 2.7. Installing all the scientific packages for Python individually can be a bit difficult, so we recommend using an all-in-one installer.

We suggest you download and install Anaconda from Continuum Analytics. When installing, choose the option to make Anaconda your system's default version of Python.
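
After installing, one way to confirm that Anaconda's Python is the one your system picks up is to run a short check like the sketch below. It is only an illustration: the helper names (version_matches, missing_packages) are hypothetical, and the package list (numpy, scipy, matplotlib) is an assumption, since the official list of required software is sent to participants separately.

```python
from __future__ import print_function  # same script runs on Python 2.7 and 3

import importlib
import sys


def version_matches(version_info, required=(2, 7)):
    """Return True if the (major, minor) part of version_info equals required."""
    return tuple(version_info[:2]) == tuple(required)


def missing_packages(names):
    """Return the names from the given list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed package list, for illustration only.
    packages = ["numpy", "scipy", "matplotlib"]
    print("Python executable:", sys.executable)
    print("Python version:", sys.version.split()[0])
    print("Version is 2.7:", version_matches(sys.version_info))
    print("Missing packages:", missing_packages(packages) or "none")
```

If the executable path does not point into your Anaconda folder, or the version is not 2.7, Anaconda is probably not set as the default Python.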

R

R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis. To interact with R, we will use RStudio, an integrated development environment (IDE).


Topics

The Unix Shell

  • What and Why
  • Files and Directories
  • Creating Things
  • Pipes and Filters
  • Loops
  • Shell Scripts
  • Finding Things

Version Control With Git

  • A Better Backup
  • Branching
  • Merging
  • Collaborating

Databases

  • Selecting
  • Removing Duplicates
  • Filtering
  • Calculating New Values
  • Ordering Results
  • Missing Data
  • Aggregation
  • Grouping
  • Combining Data
  • Creating and Modifying Tables
  • Transactions
  • Programming with Databases

Reference Guides