Mastery --- Emily Jane

Sep 27, 2012 • Emily Jane McTavish

I found this tricky for two main reasons: 1) I don’t know how to do most of these things in an advanced way (I’m definitely trying to figure that out!), and 2) while I think better computational knowledge is useful for everything, I don’t usually think of it in terms of task-based goals. So I think the things I wrote will develop as I think about them more! But presenting direct applications will definitely help learners apply skills.

  1. How do I store my data?
    • Novice: A series of Excel spreadsheets
    • Intermediate: SQL or another relational database; flat files accessed by scripts
    • Advanced: I’m not sure! RDF? Web 2.0? Hadoop?
  2. How do I navigate the command line?
    • Novice: Be able to run software written by others
    • Intermediate: Some regular expressions, and writing simple bash scripts or Python programs. Installing necessary libraries, and parsing and dealing with error messages.
    • Advanced: Same, but better!
  3. How can I automate the mundane parts of my work, such as formatting input files or pulling the useful parts out of verbose output?
    • Novice: Do lots of it by hand, write some scripts for issues that occur repeatedly
    • Intermediate: Have dedicated code for these processes, run some error checks
    • Advanced: Well-written scripts, thorough error checking.
  4. How can I understand the legacy code that has been handed down in my lab?
    • Novice: Don’t! Just run it and hope it works.
    • Intermediate: Read through it and add some comments to clarify.
    • Advanced: Rewrite it to make it clearer and more efficient
  5. How can I write code that my collaborators and labmates can understand?
    • Novice: Lots of comments
    • Intermediate: Clear, well-written code with comments for confusing parts
    • Advanced: Logical object-oriented programming with comments where helpful
  6. How can I make a computational pipeline that I can trust?
    • Novice: Cobble steps together and hope it works.
    • Intermediate: Write a script to tie together the process. Run error checking.
    • Advanced: Test boundary cases. Speed up steps that are slowing it down
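
To make questions 3 and 6 a bit more concrete, here is a minimal Python sketch of what "dedicated code with some error checks" might look like when pulling the useful parts out of verbose output. Everything specific in it is an assumption made up for illustration: the log format (`sample=... score=...`), the function name `extract_scores`, and the sample lines are hypothetical, not taken from any real tool.

```python
import re

# Hypothetical result-line format: "sample=A12 score=0.93", mixed in with chatter.
SCORE_LINE = re.compile(r"sample=(\w+)\s+score=(-?\d+(?:\.\d+)?)")


def extract_scores(lines):
    """Return {sample: score} from verbose output, ignoring unrelated lines."""
    scores = {}
    for line_number, line in enumerate(lines, start=1):
        match = SCORE_LINE.search(line)
        if match is None:
            continue  # not a result line, so skip it
        sample, score = match.group(1), float(match.group(2))
        # Error check: a repeated sample probably means something upstream went wrong.
        if sample in scores:
            raise ValueError(f"line {line_number}: duplicate sample {sample!r}")
        scores[sample] = score
    return scores


if __name__ == "__main__":
    verbose_output = [
        "INFO starting run",
        "sample=A12 score=0.93",
        "WARNING low memory",
        "sample=B07 score=0.41",
    ]
    print(extract_scores(verbose_output))
```

Moving from intermediate toward advanced would then mostly mean testing the boundary cases: empty output, output with no result lines at all, and repeated samples.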