When I was a lowly intern at OTI way back in 1999, my supervisor asked me to design and develop a plugin for their brand new extensible Integrated Development Environment called Eclipse.
Every term (by the university calendar), they’d get a batch of a dozen or so interns. Each one would have to write a plugin within two weeks. I haven’t heard of any instances where interns failed to produce one.
Fast forward to today: Most university students I run into find it exceedingly difficult to get started writing an Eclipse plugin. What changed?
The code base has certainly evolved since the 1.0 release. However, the amount of documentation and sample code has also increased tremendously.
More importantly, at OTI, we had access to a significant level of resources. We had slide decks describing the architecture and design philosophy of the system. We had tutorials about how to use the tools to do things the Eclipse way. We had physical access to 90% of the people who wrote the code we were extending. (The other 10% were the Java tooling folks; whenever we asked the local senior developers where they sat, they’d tell us to go down the main hall, past the receptionist, turn left onto the street, wait for a bus to the airport and take the first plane to Amsterdam.)
We drank their Kool-Aid from a fire hose.
Availability of resources has a significant effect on how quickly novices and experts familiarize themselves with a software system. This is obvious.
What isn’t obvious is what people actually do when these resources are constrained. What do novices do when they want to contribute to an open source project (say, for example, through Google Summer of Code)? How do they get started? Mentors prefer students who are already familiar with the code base because that demonstrates their independence and potential for getting their project finished. How do students get themselves to that point? What works? What doesn’t?
I think there are differences in the strategies novices and experts use to familiarize themselves with a software system. One way of categorizing the various cases is to group them by expertise and the level of resource support they have:

| Expertise \ Available Resources | Artifacts Only | Other Learners | Artifact Creators |
| --- | --- | --- | --- |
| Novice | N1 | N2 | N3 |
| Expert | E1 | E2 | E3 |

…where Artifacts Only means you can dig through code, documentation, diagrams and the like but you’re investigating the problem on your own; Other Learners means you have access to other people trying to investigate the same thing but they may be just as lost; Artifact Creators means you can ask questions of the people who made the artifacts and (hopefully) get enlightening answers.
I’m currently digging through psychology and software engineering research to see where the various studies fit into this grid. So far, most of the research I’ve seen falls under N3 and E3. I’ve yet to find empirical studies that fall under the rest. I’ve come up with some examples for each of the other categories:
- N1: GSoC participants
- N2: University students performing change tasks on large software systems
- E1: Eclipse Bug Day participants
- E2: Teams that inherit code bases from other teams that have been dissolved (with little or no opportunity for knowledge transfer)