Category: Requirements

Textual Analysis: Moving from Use Cases to Classes

Early in the development process, most of the information gathered from future users consists of documents (e.g., forms, policies, and procedures) and interview data (e.g., feature descriptions, desired functions, specific dislikes, and wish-list items). For a software designer using object-oriented design, the next major step in making this information actionable is capturing how users will interact with the planned software via use cases. Once an exhaustive set of use cases is in hand, the next step is converting them into a conceptual map of the domain, which is done by identifying candidate classes. Further refinement of the candidate-class pool yields a final set of classes that can be used to build the application. Textual analysis is a standard technique for generating candidate classes from use cases (1), and it is what I am using for this project.

Richard Abbott introduced textual analysis as a technique for software design in the 1983 article “Program Design by Informal English Descriptions” (2). Textual analysis relies on the structure of natural language to find items that could be adapted as classes and methods. Although Abbott was not doing OOA&D, the technique works just as well for that purpose and for others. For example, the same technique can be used to abstract the information needed for modeling workflows from narrative descriptions of clinical processes.

In the paper, Abbott describes a specific strategy for analyzing narrative:

Having developed an informal strategy, the next step is to formalize that strategy. The formalization steps are:

  1. Identify the data types.

  2. Identify the objects (program variables) of those types.

  3. Identify the operators to be applied to those objects.

  4. Organize the operators into the control structure suggested by the informal strategy.


We identify the data types, objects, operators, and control structures by looking at the English words and phrases in the informal strategy.

  1. A common noun in the informal strategy suggests a data type.

  2. A proper noun or direct reference suggests an object.

  3. A verb, attribute, predicate, or descriptive expression suggests an operator.

  4. The control structures are implied in a straightforward way by the English.
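
Abbott's heuristics amount to a lookup from part of speech to design element. Here is a minimal Python sketch of that lookup, applied to one step of the use case below ("Clinician enters patient's Last Name"). The part-of-speech tags are assigned by hand, which matches how the analysis is done on paper; the names `TAGGED_STEP`, `ABBOTT_RULES`, and `classify` are mine, not Abbott's.

```python
from collections import defaultdict

# Hand-assigned tags for one use-case step (an assumption for illustration;
# the analysis itself is a manual reading of the text).
TAGGED_STEP = [
    ("Clinician", "common_noun"),
    ("enters", "verb"),
    ("patient", "common_noun"),
    ("Last Name", "common_noun"),
]

# Abbott's heuristics: part of speech -> design element it suggests.
ABBOTT_RULES = {
    "common_noun": "data type",
    "proper_noun": "object",
    "direct_reference": "object",
    "verb": "operator",
    "attribute": "operator",
    "predicate": "operator",
    "descriptive_expression": "operator",
}


def classify(tagged_words):
    """Group words by the design element their part of speech suggests."""
    candidates = defaultdict(list)
    for word, tag in tagged_words:
        element = ABBOTT_RULES.get(tag)
        # Skip untagged parts of speech and keep only first occurrences.
        if element and word not in candidates[element]:
            candidates[element].append(word)
    return dict(candidates)


print(classify(TAGGED_STEP))
```

In OOA&D terms, the "data type" bucket is the pool of candidate classes and the "operator" bucket is the pool of candidate methods.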


Let’s look at the use case (updated) presented in an earlier post.

Use Case Name: Add Patient Profile
Description: Describes the process for adding a new patient to prn: CIM-OnCall

  1. New patient calls
  2. Clinician accesses app
    1. Selects “New Patient”
    2. Enters patient’s Last Name
      1. System displays all patients with same or similar last names along with MedRecNo
      2. Clinician reviews list and determines whether patient is already in system
        1. If in system
          1. Selects patient
        2. If not in system
          1. Resume adding demographics
      3. Add First Name
      4. Add MedRecNo
        1. Add location for MedRecNo (office, hospital, etc.)
        2. System checks if MedRecNo already in database for location given and patient with same Last and First Names
          1. If in system
            1. Flag patient for review of possible duplicate
          2. If not in system
            1. Resume adding demographics
    3. Add Gender
    4. Add Date of Birth
    5. Add Phone Numbers
    6. Add Email Addresses
    7. Add Introductory Note (optional)
    8. Save patient
      1. Patient added to database
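
The two system checks in this use case (the similar-last-name lookup and the MedRecNo duplicate check) can be sketched in Python. This is only a rough illustration under my own assumptions: the record fields, the function names, the use of `difflib` with a 0.8 threshold to decide what counts as "similar", and the choice to save a flagged patient rather than block the save are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class PatientRecord:
    # Hypothetical fields drawn from the use-case steps.
    last_name: str
    first_name: str
    med_rec_no: str
    location: str
    flagged_for_review: bool = False


def similar_last_names(records, last_name, threshold=0.8):
    """Return patients whose last name is the same as or similar to last_name."""
    target = last_name.lower()
    return [
        r for r in records
        if SequenceMatcher(None, r.last_name.lower(), target).ratio() >= threshold
    ]


def is_possible_duplicate(records, candidate):
    """True if the same MedRecNo exists at the same location for the same names."""
    return any(
        r.med_rec_no == candidate.med_rec_no
        and r.location == candidate.location
        and r.last_name.lower() == candidate.last_name.lower()
        and r.first_name.lower() == candidate.first_name.lower()
        for r in records
    )


def save_patient(records, candidate):
    """Flag a possible duplicate for review, then add the patient to the list."""
    if is_possible_duplicate(records, candidate):
        candidate.flagged_for_review = True
    records.append(candidate)
    return candidate
```

Here a plain list stands in for the database. Writing the checks out this way also makes the decision points in the use case explicit, which is what Abbott's fourth step (organizing operators into a control structure) asks for.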

I have highlighted the first instance of each noun (right now I am only interested in finding candidate classes) that might be a class for prn: OnCall.

Going over this list, I see a few obvious candidates for classes. Clinician is the main user type for the application. Patients are stored in the system, along with information about them. Demographic data (e.g., Gender, DOB) capture information about specific patients and may become either classes (Email Addresses) or properties (Gender). Locations are recorded in relation to MedRecNo. Introductory Note and database are the final class candidates.

Additional use cases would enlarge the set of nouns used to fill out the candidate-class list. We know from the project scenario that the clinician will be able to keep a record of the patient’s medications and problems, so there must be use cases for managing these items (e.g., Add Medication, Update Medication, Add Problem).

Here is a list of what I consider the most important use cases for the app.

Use Case List
Add Clinician Profile
Update Clinician Profile

Add Patient Profile
Update Patient Profile

Add Location
Update Location

Add Medications
Update Medications

Add Diagnosis
Update Diagnosis

Add Episode Note
Update Episode Note

Add Disposition
Update Disposition

Candidate Classes (so far)
Clinician (App user)
Email Address (may have email addresses tied to different roles/locations — home, work, emergency contact)
Phone (same concerns as email)
MedRecNo (same concerns as email)
Location (tied to where patient might be or go – ER, consultant, pharmacy, etc.)
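
To make the notes on these candidates concrete, here is one way they might look as bare Python dataclasses. Treat it as a sketch rather than the final design: the field names (`role`, `kind`, and so on) are placeholders I chose to reflect the concerns listed above, and whether each candidate ends up as a class or as a property is still open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    name: str
    kind: str  # e.g. "ER", "consultant", "pharmacy"


@dataclass
class EmailAddress:
    address: str
    role: str  # e.g. "home", "work", "emergency contact"


@dataclass
class Phone:
    number: str
    role: str  # same concern as email: one number per role


@dataclass
class MedRecNo:
    number: str
    location: Location  # a record number only has meaning at a given location


@dataclass
class Clinician:
    name: str
    email_addresses: List[EmailAddress] = field(default_factory=list)
    phones: List[Phone] = field(default_factory=list)
```

Giving `EmailAddress` and `Phone` a role field is the main argument for keeping them as classes instead of plain string properties.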

If this were a large project with a large development team, this would be the point at which I would create a domain model showing the potential classes and how they are linked. Domain models are great for discussing the scope and behavior of an application. They allow everyone involved to get a feel for what the application is supposed to do. Refining the domain model results in a class diagram that can be used to write code.

At this point, everything remains conceptual, as we have had little discussion of class properties and methods. Required properties and methods will come out of further analysis of the use cases and the documentation (forms, manuals, etc.) already gathered. Verbs in use cases are the starting point for methods: add, update, and enter all offer hints. Analogously, relationships between classes can be gleaned from observing that email addresses are tied to roles/locations and that a patient may have an unlimited number of addresses. Similarly, hints of inheritance hierarchies come from knowing that there are different types of locations and that much of the same information is needed from all of them (name, address, type, contact info, etc.).
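
A short Python sketch shows how these three hints (verbs as methods, one-to-many relationships, and a shared base class for locations) might translate into code. The class, subclass, and method names here are hypothetical; they only illustrate the kind of structure the hints point toward, not the final class list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Location:
    """Information needed from every type of location."""
    name: str
    address: str
    contact_info: str

    def location_type(self) -> str:
        # The subclass name doubles as the location type in this sketch.
        return type(self).__name__


@dataclass
class Hospital(Location):
    pass


@dataclass
class Pharmacy(Location):
    pass


@dataclass
class Patient:
    last_name: str
    first_name: str
    # One patient, any number of email addresses (one-to-many).
    email_addresses: List[str] = field(default_factory=list)

    # The verbs "add" and "update" from the use cases become methods.
    def add_email_address(self, address: str) -> None:
        if address not in self.email_addresses:
            self.email_addresses.append(address)

    def update_email_address(self, old: str, new: str) -> None:
        index = self.email_addresses.index(old)  # raises ValueError if absent
        self.email_addresses[index] = new
```

Whether location types deserve subclasses or just a type property is exactly the kind of question the later refinement has to settle.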

In the next post, I will give the final class list along with a description and rationale for each.

  1. McLaughlin BD, Pollice G, West D. Head First Object-Oriented Analysis and Design. O’Reilly Media; 2006.
  2. Abbott RJ. Program design by informal English descriptions. Communications of the ACM 1983;26(11):882–894.