Overview
Teaching: 60 min Exercises: 0 min
Questions
How are Software and Data Carpentry organized and run?
Objectives
Summarize the history and structure of the Software and Data Carpentry organizations.
Describe at least three similarities and differences between Software and Data Carpentry workshops.
In becoming an instructor for Software or Data Carpentry, you are also becoming part of a community of like-minded volunteers. This section provides some background on both organizations, and on the final steps toward certification.
Preparation and Discussion
This discussion assumes that trainees have read the operations guide (which is assigned as overnight homework). Instead of going through this material point by point, trainers should ask each trainee to add one non-overlapping question to a list, then go through that list.
Software Carpentry was co-founded in 1998 by Brent Gorda and Greg Wilson, who identified a need for best practices training in research computing. After several iterations, the current model of two-day workshops with a standard curriculum emerged in 2010-11. After intermediate support from various organizations, it became an independent non-profit organization called the Software Carpentry Foundation (SCF) in 2015. The SCF is now responsible for all aspects of Software Carpentry’s operations.
History Lesson
For more on Software Carpentry’s history, and on what we’ve learned along the way, see this page on its website or the paper “Software Carpentry: Lessons Learned”.
In 2013, members of the Software Carpentry community identified a need for training aimed at computational novices that would teach researchers how to properly handle their data. This led to the creation of Data Carpentry under the leadership of Tracy Teal. While separate, the two organizations share many aspects of their operations, long-term goals, and community structure:
However, they differ in their content and intended audience. Data Carpentry workshops focus on best practices surrounding data. Its learners are not people who want to learn about coding, but rather those who have a lot of data and don’t know what to do with it. Accordingly, Data Carpentry workshops:
Software Carpentry workshops focus on best practices for software development and use. Its workshops are:
Software Carpentry’s most commonly used lessons are:
Lesson | Site | Repository | Instructor guide |
---|---|---|---|
The Unix Shell | Site | Repository | Instructor guide |
Version Control with Git | Site | Repository | Instructor guide |
Programming with Python | Site | Repository | Instructor guide |
Programming with R | Site | Repository | Instructor guide |
R for Reproducible Scientific Analysis | Site | Repository | Instructor guide |
Only one of the three programming lessons (Python or one of the R lessons) is used in a typical workshop. Software Carpentry also maintains lessons on:
Lesson | Site | Repository | Instructor guide |
---|---|---|---|
Version Control with Mercurial | Site | Repository | Instructor guide |
Using Databases and SQL | Site | Repository | Instructor guide |
Programming with MATLAB | Site | Repository | Instructor guide |
Automation and Make | Site | Repository | Instructor guide |
but these are less frequently used.
The main aim of the Unix shell lesson is to familiarize people with a handful of basic concepts that crop up in many other areas of computing:
head, tail, grep, and related tools
The aims of the version control lesson are to teach people:
The ostensible aim of the programming lessons is to show people how to build modular programs out of small functions that can be read, tested, and re-used. However, these concepts turn out to be hard to convey to people who are still learning the syntax of a programming language (forest and trees), so in practice the programming lessons focus primarily on the mechanics of doing common operations in those languages.
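As a rough illustration of what "small functions that can be read, tested, and re-used" might look like, here is a minimal Python sketch. The function names (`mean`, `normalize`, `test_normalize`) are hypothetical and are not taken from any Software Carpentry lesson.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def normalize(values):
    """Return the values shifted so that their mean is zero (re-uses mean())."""
    center = mean(values)
    return [v - center for v in values]


def test_normalize():
    """A tiny test: each function is small enough to check by hand."""
    assert mean([1, 2, 3]) == 2
    assert normalize([1, 2, 3]) == [-1, 0, 1]


test_normalize()
```

Each function does one thing, so `normalize` can build on `mean` without repeating its logic, and the test documents the expected behaviour of both.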
Data Carpentry’s lessons are domain-specific and cover data organization, manipulation, and visualization skills relevant to the target domain. Currently, there are fully-developed workshops for:
There are also materials in development and testing for:
Other Data Carpentry lessons are in the incubator stage.
We have recorded what we’ve learned about writing workshops in an operations guide and a set of checklists (linked from that page) that describes what everyone involved in a workshop is expected to do and why. Questions, corrections, and additions are very welcome.
Since January 2015 we have run bi-weekly debriefing sessions for instructors who have recently taught workshops. In these, instructors discuss what they actually did, how it worked, how the lessons they actually delivered differed from our templates, what problems arose, and how they were addressed. Summaries are posted on our blog shortly after each meeting, and eventually added to our operations guide.
How We Do Things
Go to the operations guide and read the instructions for a regular instructor and for a workshop host. What situations might come up that these don’t answer?
Quoting the Software Carpentry workshop request page:
Our instructors are volunteers, and so are not paid for their teaching, but host sites are required to cover travel and accommodation costs for any instructors visiting from out of town. The Software Carpentry Foundation offers three fee schedules for workshops:
Self-Organized Workshops: Optional Donation
Software Carpentry welcomes you to organize and run your own workshop without administrative assistance from the Software Carpentry Foundation by optional donation. In order to use the Software Carpentry name and logo at your event, we only require that you follow our curriculum, have at least one badged instructor teaching and co-organizing your event, and let us know that you’re organizing a workshop. In order to help Software Carpentry continue operating and offering workshops around the world, we ask for (but do not require) a donation, and recommend $500 USD as a suitable amount.
Nonprofit Organization: $2500
If you are a not-for-profit, such as a university or government lab, the Software Carpentry Foundation will organize a workshop for you (not including instructor travel and accommodation costs) for $2500 USD.
For-Profit Institution: $10000
If you are a for-profit institution, such as a company, the Software Carpentry Foundation will organize a workshop for you (not including instructor travel and accommodation costs) for $10,000 USD of which three quarters is used to underwrite workshops at institutions that could otherwise not afford them.
We strive to be a global project and support diversity in science. If you wish to offer a workshop that would further these goals, please contact us regarding a waiver for the administration fee at the nonprofit and for-profit scales. Waivers are not required for self-organized workshops.
Quoting the Data Carpentry workshops page:
The cost of hosting a workshop is both the Workshop Administration Fee and travel expenses for the two instructors.
Workshop Administration Fee: $2500 US
This fee is for non-profit organizations, such as universities and government labs. If you are a for-profit organization, such as a company, and are interested in a workshop, please get in touch.
Partial or full waivers for fees will be considered on an as-needed basis.
Travel Expenses for Instructors: ~$2000 US
All instructors are volunteers, but the Host needs to cover their travel expenses. We work to find local instructors, but suggest that you estimate about $2000 for the travel, food and accommodation of the instructors. The details of how you will reimburse the instructors need to be established when the workshop is scheduled.
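As a rough worked example of the figures quoted above, the following Python sketch adds an administration fee to an instructor travel estimate. The function name, the dictionary keys, and the idea of expressing a waiver as a fraction are assumptions made for illustration. Applying the roughly $2000 travel estimate from the Data Carpentry page to every fee schedule is also an assumption, and actual costs depend on the workshop.

```python
# Administration fees (USD) as quoted above; the keys are illustrative labels.
ADMIN_FEES = {
    "self_organized": 0,      # donation is optional, so no required fee
    "nonprofit": 2500,
    "for_profit": 10000,
}

# Suggested travel, food and accommodation estimate for the instructors (USD).
TRAVEL_ESTIMATE = 2000


def estimated_host_cost(schedule, travel=TRAVEL_ESTIMATE, waiver=0.0):
    """Return the fee (reduced by a waiver fraction between 0 and 1) plus travel."""
    if not 0.0 <= waiver <= 1.0:
        raise ValueError("waiver must be a fraction between 0 and 1")
    fee = ADMIN_FEES[schedule] * (1.0 - waiver)
    return fee + travel


print(estimated_host_cost("nonprofit"))               # 2500 + 2000
print(estimated_host_cost("nonprofit", waiver=0.5))   # half the fee waived
```

With the default travel estimate, a non-profit host would budget about $4500 in total, and a full waiver would leave only the travel expenses.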
All of Software and Data Carpentry’s lesson materials are freely available under a permissive open license. You may use them whenever and however you want, provided you cite the original source.
What’s Core?
Our learners have such a wide spread of prior knowledge that no one fixed lesson could possibly fit everyone’s needs. We have therefore provided more material than most people will get through most of the time in order to be (reasonably) sure that we have enough for more advanced classes. In particular:
- Callouts (like this one) contain material that isn’t essential to the lesson, and which most instructors will skip.
- Most instructors only give learners one or two exercises per episode; the other exercises are there for self-study.
However, the names “Software Carpentry” and “Data Carpentry” and their respective logos are all trademarked. You may only call a workshop a Software Carpentry or Data Carpentry workshop if:
Software Carpentry and Data Carpentry share a single instructor training program, but instructors must certify separately for each at the end: see the description of the instructor checkout procedure for details.
In order to communicate with learners, and to help us keep track of who’s taught what and where, each workshop’s instructors create a one-page website using this template. Once that has been created, the host or lead instructor sends its URL to the workshop coordinator, who adds it to our records. The workshop will show up on our websites shortly thereafter.
Practice With SWC Infrastructure
Go to the workshop template repository and follow the directions to create a workshop website using your local location and today’s date.
We also have a small installer for Windows to help people set up their environment, which is maintained in this GitHub repository. This installer runs after the installer that puts Git and Bash on Windows, and does the following:
Please see the setup instructions in the workshop template for more details.
There are several hubs of activity for the Software and Data Carpentry communities:
The administration, policies, practices and content of Software Carpentry and Data Carpentry rest on the shoulders of the communities that support them. In the same way that we hope to promote a culture of openness, sharing, and reproducibility in science and research through training researchers with the tools they need, the Carpentry organizations themselves aim to be open, collaborative, and based on best practices. Just as we encourage researchers to use packages and modules in their code, to create re-usable pieces, we want to draw together the collective expertise of our teaching community to create collaborative lessons, share other materials, and improve the lessons via “bug fixes” as we go along. We discuss this in more detail in a later lesson.
While contribution is frequently seen in terms of contributing to specific lessons in either organization, there are many, many ways to contribute and participate in the Software and Data Carpentry communities.
Here are some examples of ways that people have contributed to the community:
- Show a discussion thread on one of the pull requests which contains a change in materials.
- I also showed how the thread on the Discuss list about “Leaving novices behind” turned into a blog post.
- I would also cover more about how friendly the community is, showing examples of good discussions under pull requests (possibly controversial PRs).
So being part of a friendly, open discussion is of equal or greater importance to the community than submitting the perfect lesson change. The checkout process to become a fully-fledged instructor is one way to start connecting with the community and to find the area where you can contribute best.
Software Carpentry is a democracy: its seven-member Steering Committee is elected annually by and from its membership, which includes every instructor who has taught in the two years leading up to the election. The Steering Committee has final say on all strategic and financial decisions; if you would like Software Carpentry to take a new direction, or would like to do more than teach or develop lessons, you are very welcome to put your name forward as a candidate.
Get Connected
Join our discussion lists, subscribe to our blogs, and follow us on Twitter.
Feedback on Assessment
Go through the pre-assessment questionnaire given to you by your instructor and critique its questions. (Remember, critiquing means commenting on positive aspects as well as negative ones.) How long do you think it will take the average learner to fill it in? How useful do you think the information it gathers will be to you as an instructor? How could you improve the questions? What would you add, and what would you drop to make room?
Key Points
Software Carpentry was founded in 1998 to teach scientists how to program better.
Data Carpentry was founded in 2014 to teach researchers how to handle data.
Their materials are all openly licensed, but their names and logos are trademarked.
They share teaching methods and a common instructor pool.
The workshop operations guide summarizes what they have learned about organizing and delivering training.