The original Baseline Test established the human performance and user satisfaction levels for the existing site. We also used it to identify the major usability issues that would guide changes for the new homepage. This was consistent with our usability and user-centered design approach.
The Baseline Test sessions took place in August 2006 at three locations in the United States (Washington, DC; Atlanta, Georgia; Ogden, Utah). All were in-person usability tests conducted in government usability labs. Because of the type of data we were collecting, we elected not to use remote usability testing.
The participants included both federal employees and people who had no affiliation with the government. We tested a total of 68 participants, who included:
- Public Health Professionals
- Healthcare Workers (Physicians, Nurses, etc.)
- General Consumers
- Researchers and Scientists
- Journalists, Legislators, Students
The participants had a mix of gender, age, education, race and Internet experience that matched typical users of CDC.gov. The usability testing sessions were about one hour long and were conducted using Keynote’s WebEffective and TechSmith’s Morae.
The User Experience Team created 36 scenarios that reflected the tasks performed most frequently on CDC.gov. Considerable time and effort went into identifying these tasks. We drew on information from interviews, surveys, call center reports, evaluations of web logs (Omniture), ACSI results, and other usability sources.
Each participant attempted 10 scenarios. All participants were told to use the CDC.gov website to find the correct answer, and not to use the website’s search capability. Later, a ‘search-only’ test was conducted to determine the impact (if any) of not allowing users to search. Between the two tests, there was no reliable difference in success rates, average time, average page views or satisfaction scores.
[Table: average page views and satisfaction (out of 100) for the two test conditions]
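One common way to check whether two success rates differ reliably is a two-proportion z-test. The sketch below is only an illustration of that idea. The text does not say which statistical test was actually used, the function name is hypothetical, and the counts are invented placeholders rather than results from the study. It also treats every attempt as independent, which is a simplification when each participant attempts several scenarios.

```python
import math


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for the difference between two success rates.

    Uses the pooled-proportion standard error. Assumes independent attempts
    and a pooled rate strictly between 0 and 1.
    """
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (rate_a - rate_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts for illustration only (NOT data from the study):
# condition A = browse without search, condition B = search-only.
z, p = two_proportion_z_test(54, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.2f}")
```

With these placeholder counts the p-value is well above 0.05, which is the kind of result that would be reported as no reliable difference.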
Participants were instructed to work as quickly and accurately as possible. They answered pre-scenario questions, responded to several task scenarios, and then answered post-scenario questions (including a satisfaction metric, the System Usability Scale, or SUS). If they did not find the answer to a scenario question within three minutes, they were automatically moved to the next scenario and were counted as ‘unsuccessful’ on that scenario.
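For readers unfamiliar with the SUS, the sketch below shows the standard published scoring rule, which turns ten responses on a 1-to-5 scale into a score out of 100. The text does not describe how the team computed its scores, so this is the conventional method rather than a record of their procedure, and the function name is hypothetical.

```python
def sus_score(responses):
    """Score one SUS questionnaire on a 0-100 scale.

    `responses` holds the ten item ratings in questionnaire order, each an
    integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: contribute rating - 1.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded: contribute 5 - rating.
            total += 5 - rating
    # The summed contributions range from 0 to 40; scale to 0-100.
    return total * 2.5


# A participant who answers 3 (neutral) on every item scores 50.0.
print(sus_score([3] * 10))
```

A study-level satisfaction figure is typically the mean of these per-participant scores.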
The overall success rate across all 36 scenarios was 54%. In other words, users were able to successfully complete only about half of the scenarios in the allowed three-minute time limit. The figure below shows the success rates for 11 of the scenarios.
The task scenarios are shown using their ‘short description’. This description was used to facilitate reading of the charts. To provide a better understanding of the scenarios used in our testing, 11 of the task scenarios are described below:
- Vaccines: Meningococcal conjugate vaccine (MCV4) should be administered to all children at age 11-12 years. What is the requirement for college freshmen living in dormitories?
- Smoking: Every day in the United States, what number of young people (ages 12-17) try their first cigarette?
- HPV: CDC’s Advisory Committee recommends that girls should receive this vaccine at what age?
- OlderFalls: What percent of adults over age 65 have a serious fall each year?
- ADHD: What percent of children between the ages of 4-17 years have been diagnosed with Attention Deficit Hyperactivity Disorder (ADHD)?
- CO: CDC recommends having your heating system, water heater and any other gas, oil, or coal burning appliances serviced by a qualified technician. How often should you do this?
- Asbestos: Asbestos fibers are so small they cannot be seen. When do these tiny fibers become dangerous?
- Mold: After a major hurricane, many homeowners return home to find that their home has been flooded. How much bleach should you use to clean up the mold?
- TravelShots: When traveling to South Asia, there are several recommended vaccinations you should get, including vaccines for: Hepatitis A and B, Malaria, and Typhoid. If you plan to go camping, hiking or bicycling, what other recommended vaccine should you get?
- BMI: The Body Mass Index (BMI) is a number calculated from a person’s weight and height. A person is classified as overweight when their BMI is over what number?
- FAS: Fetal alcohol syndrome (FAS) is one of the leading known preventable causes of mental retardation. What is the best way to eliminate this type of mental retardation?
Using information gained during the Baseline Test, we focused on changes for the scenarios that elicited the poorest performance. Among the scenarios shown above, these were the ones with success rates of 64% or less. Some scenarios elicited very good performance, and we encouraged the design team not to make changes that would lower those success rates. We found it necessary to continuously remind designers to stay focused on the scenarios where the success rate was lowest. This was one of the major uses of the usability testing sessions: it kept designers focused on the homepage issues that most needed their attention.
The qualitative findings from the usability testing were also used to inform the proposed changes to the homepage. We had participants type their overall impressions, what they liked best and least, and what changes they would make if given the chance. During the tests, testers took notes on the problems participants were having and asked questions in a debriefing at the end of each test.
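The prioritization described above can be sketched as a small calculation: compute a success rate per scenario, counting an attempt as successful only if the answer was correct within the three-minute limit, then list the scenarios at or below the 64% cutoff. The record layout, function names, scenario labels, and trial data here are all hypothetical placeholders, not the study's data or tooling.

```python
from collections import defaultdict

TIME_LIMIT_SECONDS = 180  # three-minute limit per scenario
FOCUS_THRESHOLD = 0.64    # scenarios at or below this rate get design attention


def success_rates(trials):
    """Map each scenario to its success rate.

    `trials` is an iterable of (scenario, answered_correctly, seconds) tuples.
    """
    attempts = defaultdict(int)
    successes = defaultdict(int)
    for scenario, answered_correctly, seconds in trials:
        attempts[scenario] += 1
        if answered_correctly and seconds <= TIME_LIMIT_SECONDS:
            successes[scenario] += 1
    return {scenario: successes[scenario] / attempts[scenario] for scenario in attempts}


def scenarios_needing_attention(rates, threshold=FOCUS_THRESHOLD):
    """Return scenarios at or below the threshold, lowest success rate first."""
    flagged = [scenario for scenario, rate in rates.items() if rate <= threshold]
    return sorted(flagged, key=lambda scenario: rates[scenario])


# Invented trial records for illustration only (NOT data from the study).
trials = [
    ("ScenarioA", True, 40), ("ScenarioA", True, 75),
    ("ScenarioA", True, 120), ("ScenarioA", False, 180),
    ("ScenarioB", True, 95), ("ScenarioB", False, 180),
    ("ScenarioB", False, 180), ("ScenarioB", False, 150),
    ("ScenarioC", True, 60), ("ScenarioC", True, 110),
    ("ScenarioC", False, 180), ("ScenarioC", False, 180),
]

rates = success_rates(trials)
flagged = scenarios_needing_attention(rates)
print(rates)
print(flagged)
```

Sorting the flagged list from lowest rate upward mirrors the emphasis on keeping designers focused on the weakest scenarios first.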
Some of the observations after the Baseline test included:
- Many felt that the homepage had too much information (overwhelming).
- Participants struggled to find information because of busy, cluttered pages.
- Participants who found the A-Z index liked it and used it quite frequently, but it was hard to find.
- Participants thought that the website was inconsistent in layout, navigation, and look and feel.
- Participants did not feel that the categories of information were clear.
- Participants thought that they had to go through too many layers to find information.
- Participants did find that the features and the page descriptions were useful.
We found these observations invaluable once we had determined which scenarios were causing the most usability problems. These insights from both testers and users helped us decide which changes had the best chance of improving the website.