Sunday, September 20, 2009

GC Musings: The Dry Run Effect

Now that The Muppet Movie Game is postponed indefinitely, those of us interested in Game planning will have to go elsewhere to get our fix of behind-the-scenes discussion.

This is the first in an ongoing, irregular series of articles I'm going to write about my own GC experiences. I encourage other GCs to contribute their thoughts as well, either in comments here or on their own blogs.


THE DRY RUN EFFECT
By Curtis Chen, Team Snout

Background

Last Sunday, DeeAnn and I ran the Portland DASH, a one-day walking puzzle hunt in downtown PDX. "DASH" stands for "Different Area—Same Hunt," and we had worked with people in seven other cities to organize this event. For us, it was a much smaller event than we're used to, and we saw some interesting differences.

When Team Snout has run full-weekend, driving Games, we always see a huge "spread" between the fastest teams and the slowest teams. Some people are puzzle fiends; others like to take their time and forgo hints for hours. We do our best to support both styles of play, but in some of our later events, we spent a lot of effort trying to manage the spread and cause teams to finish within a two-hour window. (That may sound like a long time, but the natural spread is eight to ten hours. That's no good when your end party location is only open for six hours on Sunday.)

In 2006, we did a "dry run" (full-scale, on-location playtest) of the Hogwarts Game with three teams, two weeks before the actual event. We had an observer riding along with each team, so we got very detailed data about how they were solving throughout the event. (One team also inconvenienced a couple of young lovers, but that's another story...)

The most interesting thing we observed on the Hogwarts dry run was the complete lack of a typical spread, despite our preparation of several "bonus clues" to keep faster teams occupied. The three dry run teams never let themselves get more than three clues apart at any time. According to our observers, whenever one team saw another team pulling out of a location, the remaining team would suddenly become more motivated to take a hint and speed up their solving of that clue. Nobody likes being left behind.

I've started calling this "the dry run effect," and we recently saw it in action during the Portland DASH.

So Crazy It Just Might Work

During most of the Portland DASH, DeeAnn and I were the only GC staff available. (Another long story.) This meant we had to cover all the tasks: handing out clues at each location, answering the telephone help line, monitoring clue sites, and taking care of any other little crises that came up. We knew we wouldn't be able to handle doing hints by phone if more than two teams called at once, so we decided to give pre-printed hint envelopes with each clue.

Every single one of our teams finished ahead of schedule: we started around 10:15 AM, and the first team hit EndGame around 2:30 PM. Our scheduled hard cutoff time was 4:30 PM, but even the slowest team arrived an hour before that. I'm pretty sure being able to take hints at any time, without having to call GC and admit you were stumped, caused more teams to take hints earlier and more often. But there was another important factor--we ran the event as a relay.

This is, to my knowledge, something that no other GC has done in this type of event. (If you know of someone who has, please tell me; I'd love to compare notes.) Our goal was to make clue distribution possible for a two-person GC to handle. This is how it worked:
  • GC waited at each location for the first team to arrive.
  • When the first team showed up, GC handed all the sealed clues to that team.
  • The team opened one copy of the clue and started solving it.
  • When the next team showed up, the first team handed all the remaining, sealed clues to them.
  • This process repeated for every subsequent team. If a team ever finished solving their clue before the next team showed up to "hold the bag," they contacted GC for further instructions. (We were usually able to return to the location and hold the remaining clues until the next team arrived.)
This worked out pretty well; it even fit the Old West theme, because GC was the "Sheriff" and each team captain was a "Deputy." At the end party, one team told us they really liked this system because it caused them to see more of the other teams throughout the event.
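
The relay rules above can be sketched in a few lines of Python. This is only an illustration, not a record of how the event was tracked: the `Visit` record, the `gc_callbacks` function, and the team names and times are all made up for the example. It also simplifies by assuming the most recently arrived team always holds the bag, and it flags the gaps where that team finishes before the next one shows up, which is when GC would get a call.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Visit:
    team: str     # hypothetical team name
    arrives: int  # minutes after the start when the team reaches the location
    solved: int   # minutes after the start when the team finishes the clue


def gc_callbacks(visits: List[Visit]) -> List[Tuple[str, int, int]]:
    """Return (team, gap_start, gap_end) for each time GC must hold the bag.

    Assumption: the most recent arrival holds the remaining sealed clues
    until the next team arrives. If the holder solves the clue first,
    GC has to cover the gap between `solved` and the next arrival.
    """
    ordered = sorted(visits, key=lambda v: v.arrives)
    gaps = []
    for holder, nxt in zip(ordered, ordered[1:]):
        if holder.solved < nxt.arrives:
            gaps.append((holder.team, holder.solved, nxt.arrives))
    return gaps


if __name__ == "__main__":
    # Made-up times for one clue location.
    example = [
        Visit("Team A", 0, 25),
        Visit("Team B", 10, 30),
        Visit("Team C", 45, 70),
        Visit("Team D", 60, 80),
    ]
    print(gc_callbacks(example))
```

In this made-up schedule, only Team B finishes before its successor arrives, so GC would be called back once. The tighter the teams stay together, the fewer of these gaps a two-person GC has to cover.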

But remember the spread I was talking about? We had seven teams in the Portland DASH, and we never saw them spread across more than three clue locations--the same as in the Hogwarts dry run. I haven't done all the number crunching and statistical modeling, but the following is my intuition about what's going on.

Three in 3

When Team Snout discusses "the spread," we talk about three sub-groups of all the teams that are playing: We have (1) the fast teams, (2) the middle of the pack, and (3) the slow teams. (Please note that none of those terms is intended as pejorative; we recognize that people play at different speeds, and we do not force anyone to conform to a specific timeline. We want everyone to have fun.)

This spreading-out happens in many circumstances, even down to the team level. Think about it: When a Game team arrives at a location, there's always the one guy who jumps out of the van and runs flat-out to get the clue, then the rest of the team who tumble out after it's parked, and finally the driver, who has to lock up. (If it's Sunday morning, there may also be one or two nappers who stay behind.)

With twenty-plus teams, the atomic units become clusters of teams instead of individual team members. But with a smaller number of teams--say, three in the Hogwarts dry run, or seven in the Portland DASH--I believe players recognize each other more easily, and they're more aware of where they are in the pack. If a team thinks they're falling behind, they may think about taking a hint sooner.

Nobody likes being left behind. With only seven teams, you'll know when you've encountered most of them. With twenty teams, there are some you're never going to see, and unless told otherwise by GC, you can always hold out hope that some of them are still behind you; therefore you have less motivation to speed up your solving.

Conclusions

This is just an observation. I don't think this is a problem that requires fixing, but if confirmed, it will be useful for other GCs to know. Your dry runs will not show the same spread as your full event, and if your event is small, you won't see much of a spread at all. This will affect your timeline and staffing requirements. Plan accordingly!

5 comments:

  1. An alternative explanation might be that playtesting in general tends to be a little more mellow than the actual game: there doesn't seem to be as much at "stake", whatever that means. I have no idea how DASH felt up there, but with no points, free hints, and a new genre altogether for the area, it wouldn't surprise me if it had a similar feel to a Bay Area dry run.

    I've been in 2 dry runs (Paparazzi and Hogwarts), and neither time do I remember being especially interested in taking hints because of what the other teams were doing. We took hints when we were banging our heads against walls and making no progress.

    Also, in the Hogwarts playtest, I think all 3 teams were more or less evenly matched, so the spread was probably going to stay smaller because of that more than anything else.

  2. Your theory seems to be that with fewer teams, the spread will be less, because there's more awareness of your standing?

    That's an interesting theory, but I'm doubtful. Like Greg, I think it's more about the fact that playtests have a very different psychological feel to them, and that you rarely have a curve-breaking uber-team on a playtest.

    (Also, statistics can be deceptive. The expected maximum for a sample of 25 normally distributed values is almost 2 sigma; for a sample of 3 it's only ~0.85 sigma.)

    Anyway, if your theory is correct, that means you'd see a smaller spread if teams were more aware of their position? It would be interesting to see if the type of online leaderboard used in BANG 25 has any effect on player behavior.

  3. Thanks for the comments!

    Greg: You're right about playtests being more mellow than actual events; it definitely reduces the competitive tension a little. As for teams being "evenly matched," I would argue that psychology is more important than ability in determining how quickly clues get solved, because you can always choose to take a hint.

    Dan: Thanks for bringing the math. :) I would be very curious about the BANG 25 data, if that ever gets published. However, seeing a number on a scoreboard is very different from seeing a team pass you by in person. I'd argue that the latter is much more demoralizing and thus motivating.

  4. In the Portland DASH, the relay concept might have been a subconscious glue that held teams together. The lead team at a clue might be less willing to take hints to forge ahead and create a gap, knowing in the back of their mind that it may inconvenience GC a little. The trailing team might feel opposite - they never want to lose sight of the team in front for the same reason.

    As for Hogwarts, I'd agree with Greg's math logic, plus small number statistics. With one sampling of 3 teams, it's very probable to end up with 3 evenly matched teams.

  5. Matching in playtests is interesting. In Seattle (for the MS intern events, at least), organizers try to deliberately unbalance the playtest, coming up with a "fast team" and a "slow team", and using the difference to get a very vague idea of how much spread to expect.

    Though there's certainly more of a need for it in those cases. The expected spread is even broader than most other events, as most of the interns have never seen an event like this before but there are a few who've won a Mystery Hunt or two. (And you have a bigger pool of potential playtesters, because you have all the MS employees who know they can't participate any other way)

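
For anyone who wants to check the order-statistics point raised in comment 2, here is a minimal sketch in Python. It assumes, as that comment does, that team speeds are normally distributed, which is a modeling assumption and not something measured at either event. The function names `expected_max` and `expected_range` are made up for the example. The expected maximum of n standard normal values is the integral of x * n * pdf(x) * cdf(x)^(n-1), which the code evaluates with a plain trapezoid rule.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_max(n, lo=-10.0, hi=10.0, steps=20000):
    """Expected maximum of n independent standard normal values, in sigmas.

    Integrates x * n * pdf(x) * cdf(x)**(n - 1) with the trapezoid rule.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        density = n * normal_pdf(x) * normal_cdf(x) ** (n - 1)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * density
    return total * h


def expected_range(n):
    """Expected gap between fastest and slowest; by symmetry, 2 * E[max]."""
    return 2.0 * expected_max(n)


if __name__ == "__main__":
    # 3 teams (Hogwarts dry run), 7 (Portland DASH), 25 (a full-size Game).
    for n in (3, 7, 25):
        print(n, round(expected_max(n), 2), round(expected_range(n), 2))
```

Under this assumption the expected spread grows with the number of teams even if nobody changes how they play, so a small field would look tighter on statistics alone. That does not rule out the psychological explanation; it only shows that the two effects point the same way and would need more data to separate.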