Courting Eliza: February 2013

Thursday, February 28, 2013

The User Experience Team

After researching and writing my last post on Strategic UX Design, I started thinking more about how I've spent my time at the various organizations that I've worked for. Early in my career I spent a good bit of time as a UI developer and later as a researcher and project manager for UI-intensive projects.

If I map my experience onto the 5 Competencies of UX Design diagram from UXmatters, I've invested most heavily in Interaction Design and Prototype Engineering, with quite a bit of Information Architecture recently and scattered, intense periods of Usability Engineering and Visual Design. In terms of UX Strategy, I've spent some time on it at the beginning of each project. However, it's often hard to maintain a focus on strategy within the daily grind of software design and development.

It's not surprising that my most successful and productive periods of UX work have come while working within a good team. When two people work well together, they can more than double their productivity. The same can be true (to a lesser extent) for larger teams if you have quality people and good leadership.

In some of my past projects, I've been lucky enough to work with trained and experienced UX professionals. In other cases, I was able to recruit and develop folks with other backgrounds such as documentation, testing, UI software development, business analysis and graphic design.

In this post, I want to capture my model of the roles on a good UX team. Often two or more of these roles are filled by a single person (based on resources available for UX design), but to the extent that the roles can be filled by quality specialists (perhaps borrowed from other groups in the organization), you can expect to see corresponding performance improvements.

UX Lead: (objectives, strategy, total customer experience, interface with other leads [dev, qa, marketing, sales, senior management?], UX process and integration, manage team, prioritization)
Business Analyst/Research: (understand and document business needs/rules/requirements, research and document information architecture and terminology, help develop value map, interface with customers, other business analysts, documentation team, marketing and sales)
Interaction and Visual Design: (UI logical framework, look and feel standards, contribute to UI specifications/prototypes for each UI element (screen, control, etc.), ensure designs meet measurable objectives, interface with marketing and dev)
Usability Testing/Research: (set up and run usability tests/research, document findings, track and document usability issues / feature requests, interface with QA team, overlap with UI functionality testing, ensure that UI conforms with specs and standards)
Prototype/UI Engineering: (develop UI framework and reusable components, rapid prototyping of new designs, help translate new designs into production code, interface with dev)

Note: I referenced the excellent resource, 5 Competencies of UX Design, as I was writing this up. You should expect to see a good amount of overlap.

Not sure if this doodle adds anything to the post, but it was fun to draw. :-)

Wednesday, February 27, 2013

Strategic UX Design

Recently, I've been thinking quite a bit about strategic UX design. I'm sure there are many ways to think about this, but here's my cut:

Tactical UX Design is about using specific tools and techniques to design a particular user experience. UXmatters provides a nicely organized list of UX techniques and work products in their 5 Competencies of UX Design diagram (see below).

Nicely organized list of Tactical UX Design skills and work products from UXmatters

Strategic UX Design is where an organization decides how to apply its limited resources to achieve some effect. This should be a measurable effect (i.e., key performance indicator) such as an impact on sales, improved user satisfaction, reduced training, improved throughput/task completion, error reduction, etc. I think Leisa Reichelt's diagram (see below) does a nice job of describing the relationship between business strategy, UX strategy and UX tactical execution.

A model of the relationship between business strategy and UX from Strategic UX.

From my perspective, the key UX strategy activities include the following:

Setting the Scope and Objectives: It's worthwhile to spend a little time thinking about who the "Strategic UX Thinker" is reporting to (CEO, CTO, Project Manager, Product Manager, etc.), what their goals are, and how they and their goals relate to the rest of the organization. UX is often housed in the development department, but it requires input from, and has an impact on, other parts of the business, including sales, marketing and other products.
Understanding the Customers and their Contexts: This step identifies the "big picture" of the software experience. What kind of situations are the users in when they are using the software? How are they feeling and how do we want them to feel? What challenges are they facing and how can we help? Here is where you should be able to identify "big picture" issues and opportunities that a company can address. Contextual Design, Personas, and Value Maps are three key techniques for capturing and communicating this information.
Teamwork and Process: I've been in enough organizations to know that UX can work in just about any kind of team and software development process. Key strategic UX issues include:

How are the UX objectives prioritized relative to other product objectives and are there sufficient resources to meet the short-term and long-term UX objectives?
What is the organizational structure and development process and how do they evolve over time to meet new needs?
How is organizational knowledge of the customer captured, communicated and utilized within the organization?

Well, that's it for this post. If you're interested in reading more, here's a nice list of other resources on the topic of Strategic UX Design:

Tuesday, February 26, 2013

A Simple Scala Internal DSL

My earlier posts have talked about why I'm using Scala to create a DSL for documenting user interface procedures. I also introduced Sikuli and posted a simple Java program for capturing on-screen images and then later using the Sikuli vision library to find them on the screen and click them.

In this post, I will model a very simple Internal DSL in Scala that can execute this script:
Click the "OK".button;

As you can see, as is the case with internal DSLs, I couldn't quite get the syntax the way that I wanted, but it's pretty close. A lesson learned is that you can use internal DSLs in situations where you aren't too picky about the syntax. In my next post I will share an external DSL that has the exact syntax that I want (and some other benefits also).

Here's how to read the code (at the bottom of this post):

Imports
I create a Scala object (MethodRunner) that extends App (basic Scala stuff)
I create the Sikuli screen object (main object for executing Sikuli commands)
I create the "Click" object (the first word in my syntax example above)
I define a "the" method (second word in my syntax) which takes in a single Component as a parameter.

When called, it creates an image filename and checks to see if the file exists or not.
If it exists, it asks Sikuli to find a matching image on the screen and click it.
If it does not exist, it asks Sikuli to capture the image and store it for next time. See my last post for a more in-depth explanation of how this works.

The next section converts the "OK".button syntax into a Component object. It's basic Scala Internal DSL stuff. For more information on this topic, take a look at Designing Internal DSLs in Scala by Debasish Ghosh.
There is a section that does the work of capturing screenshots in Sikuli (called from "the" method above). Again, see my last post for an explanation.
Finally, I embed the Click the "OK".button; script inline. This is one of the benefits (and drawbacks) of an internal DSL.
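For readers more at home in Java, the structure described in the list above can be approximated with a fluent interface. This is only a sketch under stated assumptions: the names FluentClick, UiComponent and ScreenDriver are hypothetical and are not part of Sikuli. The ScreenDriver interface stands in for the Sikuli calls so that the example runs without a screen, and the image path is simplified.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for the Sikuli calls used in this post.
interface ScreenDriver {
    void click(String imagePath);    // find the image on screen and click it
    void capture(String imagePath);  // ask the user to select the image and store it
}

// A UI component named in a script, e.g. the "OK" button.
final class UiComponent {
    final String name;
    final String kind;

    UiComponent(String name, String kind) {
        this.name = name;
        this.kind = kind;
    }

    // Plays the role of the "OK".button conversion in the Scala version.
    static UiComponent button(String name) {
        return new UiComponent(name, "button");
    }

    String describe() {
        return name + "." + kind;
    }
}

// Plays the role of the Click object and its "the" method.
final class FluentClick {
    private final ScreenDriver driver;
    private final Predicate<String> imageExists;

    FluentClick(ScreenDriver driver, Predicate<String> imageExists) {
        this.driver = driver;
        this.imageExists = imageExists;
    }

    void the(UiComponent component) {
        // Simplified path for illustration only.
        String path = component.name + "_" + component.kind + ".png";
        if (imageExists.test(path)) {
            driver.click(path);    // known image: find it and click it
        } else {
            driver.capture(path);  // unknown image: capture it for next time
        }
    }
}
```

With a real driver, a script line would read something like click.the(UiComponent.button("OK")), which shows how much more punctuation Java needs than Scala for the same sentence.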

----

import org.sikuli.script.ScreenHighlighter
import org.sikuli.script.Screen
import org.sikuli.script.Location
import org.sikuli.script.CapturePrompt
import org.sikuli.script.Observer
import org.sikuli.script.Subject
import java.io.File
import org.sikuli.script.Pattern
import javax.imageio.ImageIO

object MethodRunner extends App {

  // =============================
  // Sikuli screen object
  var screen = new Screen()

  // =============================
  // Script Actions
  object Click {
    def the(in: Component) {
      println("Click the " + compDesc(in))

      var filename = safeFilename(in);
      println(filename)
      var file = new File(filename);

      if (file.exists()) {
        screen.click(new Pattern(filename));
      } else {
        captureComponent = in
        cp.prompt("I don't know what the '" + compDesc(in) +
          "' looks like. Please select it.");
      }
    }
  }

  // =============================
  // this supports the syntax "OK".button
  implicit def toComponentBuilder(in: String) =
    new ComponentBuilder(in)
  class ComponentBuilder(name: String) {
    def button(): Component = {
      return new button(name);
    }
  }

  abstract class Component(val name: String)
  class button(name: String) extends Component(name)
  def compDesc(in: Component) = in.name + "." +
    in.getClass().getSimpleName()

  // =============================
  // capture and store screenshots for Sikuli
  var cp = new CapturePrompt(screen)
  cp.addObserver(CaptureObserver)
  var captureComponent: Component = null

  var USER_HOME = System.getProperty("user.home");
  def safeFilename(comp: Component) = USER_HOME +
    "/Method/img/cache/" + comp.name.replaceAll("\\W+", "_") +
    "_" + comp.getClass().getSimpleName() + ".png";

  object CaptureObserver extends Observer {
    def update(s: Subject) {
      var img = cp.getSelection()
      ImageIO.write(img.getImage(), "png",
        new File(safeFilename(captureComponent)));
      cp.close()
    }
  }

  // =============================
  // The Script

  Click the "OK".button;

}
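The file naming done by safeFilename in the listing above is easy to try in isolation. Here is a minimal Java sketch of the same rule, assuming a hypothetical ImageCachePaths class and passing the base directory and component type in as plain strings rather than reading them from the environment or the class name.

```java
// Hypothetical helper that mirrors the safeFilename rule from the Scala listing.
final class ImageCachePaths {

    // Every run of non-word characters in the component name becomes a single "_".
    static String safeFilename(String baseDir, String componentName, String componentType) {
        String safeName = componentName.replaceAll("\\W+", "_");
        return baseDir + "/Method/img/cache/" + safeName + "_" + componentType + ".png";
    }
}
```

One consequence of this rule is that names differing only in punctuation, such as "Save As..." and "Save As?", map to the same cache file.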

Tuesday, February 12, 2013

Sikuli Trick: Capturing images with the Sikuli Java library

In my last post, I introduced Sikuli, talked about its capabilities and power, and posted a very small code sample that showed the basics of using the Sikuli Java library.

In that code sample, I showed how you could use Sikuli to load an image, search for it on the screen, and click it. However, there is a big gap in that workflow -- if everything is based on images in Sikuli, where do all of those images come from?

Luckily, Sikuli provides a very nice capability to support the selection, capture, and storage of images from the screen. If you use their UI, you get it for free. It's also available in the Java library, but it's not well documented, so I'll post it here for future Googlers to find.

The interesting difference between the code below and the code from my last post happens when the image file does not exist (!imageFile.exists()). The CapturePrompt.prompt() call freezes the screen and gives you a crosshair to make a selection. When the selection is made, the update() callback is called; inside it, I fetch the selection as a ScreenImage object with getSelection(), retrieve the Image object and save it as a file. This happens the first time the program is run. Each subsequent time the program is run, it will find the file and then click the matching image on the screen.
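The "capture once, reuse afterwards" pattern can be sketched without Sikuli at all. In the minimal Java sketch below, the CaptureOnce class and its ensureImage method are hypothetical names, and a Supplier stands in for the interactive capture prompt, so this shows only the caching logic and not the Sikuli API.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;
import javax.imageio.ImageIO;

// Hypothetical helper: save a captured image only if it is not cached yet.
final class CaptureOnce {

    // Returns true if a new image was captured and saved,
    // false if the file already existed and was left untouched.
    static boolean ensureImage(File imageFile, Supplier<BufferedImage> capture) {
        if (imageFile.exists()) {
            return false; // cached by an earlier run
        }
        try {
            // The supplier is only invoked when the file is missing.
            ImageIO.write(capture.get(), "png", imageFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

In the real program the capture step is asynchronous (the image arrives in the update() callback), so this synchronous version is a simplification.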

In my next post, I'll tie this back to the Scala DSL that I posted about on Jan 30 and Feb 4 to make an English-like language for describing procedures that the computer can execute.

----

import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

import org.sikuli.script.CapturePrompt;
import org.sikuli.script.FindFailed;
import org.sikuli.script.Observer;
import org.sikuli.script.Screen;
import org.sikuli.script.ScreenImage;
import org.sikuli.script.Subject;

public class SikuliExample2 {

    public static void main(String[] args) {

        // Create a Sikuli Screen object.
        Screen screen = new Screen();

        // Specify an image that we'd like to click
        String USER_HOME = System.getProperty("user.home");
        String buttonImage = USER_HOME + "/Sikuli/OK_Button.png";
        final File imageFile = new File(buttonImage);

        // If the image file doesn't exist, screen grab
        // Note, this only needs to be done the first time
        if (!imageFile.exists()) {
            final CapturePrompt cp = new CapturePrompt(screen);
            // Register the observer before showing the prompt so the
            // callback is already in place when a selection is made.
            cp.addObserver(new Observer() {
                public void update(Subject arg0) {
                    // Save the selected region as the image file.
                    ScreenImage img = cp.getSelection();
                    try {
                        ImageIO.write(img.getImage(), "png", imageFile);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                    cp.close();
                }
            });
            cp.prompt("'OK_Button.png' not found. Please select it.");
        } else {
            // Find the button and click it.
            try {
                screen.click(buttonImage);
            } catch (FindFailed e) {
                System.out.println("Couldn't find: " + buttonImage);
            }
        }
    }
}

Friday, February 8, 2013

Sikuli: An "On Screen" Computer Vision Library

In my last post, I briefly mentioned Sikuli as a tool that I've been using in combination with other technologies for usability analysis and automated testing. However, before I get deeper into how I've been using it, this post will give my overview on what Sikuli is and why I think it is so useful.

If you go to the Sikuli website (http://www.sikuli.org/), you'll see Sikuli described as a tool for either: 1) running macros, or 2) automated software testing. It can do both of these things quite well, but it's important to note that Sikuli is much deeper and more powerful than other tools that can do this.

From my perspective Sikuli consists of several parts:

OpenCV: Sikuli utilizes the OpenCV computer vision library. This is where the real power comes from. Unlike traditional macro and software testing tools, Sikuli is based on vision. If you can see it on the computer screen, so can Sikuli. It's not perfect (text recognition is a weakness), but it can handle situations that other tools can't. For example, you can write a Sikuli script to play a game that is written in Flash (or Silverlight, or HTML, or C++) even if it doesn't have an API. As long as there are things to see and react to, Sikuli can be used.
CV Tuned to the Computer Screen: This isn't really a separate 'part' per se, but it's so important that it's worth its own bullet. Computer vision is such a broad and deep topic that an average Joe who tried to download OpenCV and use it for recognizing something on the computer screen would have a lot of learning and work to do. The creators of Sikuli at the MIT User Interface Design Group did all of this work and bundled it into Sikuli for you.
Java API: The base implementation of Sikuli is written as a Java library. You can bundle this into any Java application. It's my preferred method of using Sikuli.
Custom UI and Jython: This layer is a nod towards usability for non-programmers. There are some cool features here, but as a programmer, it doesn't really fit what I'm trying to do.
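To make the idea of "finding something by sight" concrete, here is a toy Java sketch that searches a grid of pixel values for an exact copy of a smaller grid. It is not how Sikuli or OpenCV work internally: real computer vision matching scores approximate similarity, while this hypothetical ToyTemplateMatch class only accepts exact equality.

```java
// Toy illustration only: exact-match search of a small grid inside a larger one.
final class ToyTemplateMatch {

    // Returns {row, col} of the first exact match of template in screen,
    // or null if the template does not appear anywhere.
    static int[] find(int[][] screen, int[][] template) {
        int templateHeight = template.length;
        int templateWidth = template[0].length;
        for (int row = 0; row + templateHeight <= screen.length; row++) {
            for (int col = 0; col + templateWidth <= screen[0].length; col++) {
                if (matchesAt(screen, template, row, col)) {
                    return new int[] {row, col};
                }
            }
        }
        return null;
    }

    // True if every template pixel equals the screen pixel at the given offset.
    private static boolean matchesAt(int[][] screen, int[][] template, int row, int col) {
        for (int r = 0; r < template.length; r++) {
            for (int c = 0; c < template[0].length; c++) {
                if (screen[row + r][col + c] != template[r][c]) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Even this naive version hints at why vision-based tools do not need an API from the application under test: all they need is the pixels.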

Below I've provided code that shows the simplest possible example of using the Sikuli Java API to do something. Here's what it does:

First, I provide the location of a screenshot of an OK button that I'm going to ask Sikuli to find and click for me. In future versions of this program, I will check right here that the file really exists (and if it doesn't exist, I will have Sikuli help me create it).
Next, I create an instance of the 'Screen' class. It's the starting point for most of Sikuli's functionality.
Finally, I have a try-catch block where I ask Sikuli to find something that looks like "OK_Button.png" on the screen and click it. It might fail for two reasons: a) the file doesn't exist, or b) it exists, but there is nothing on the screen that looks like it.
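The two failure reasons in the last step can be told apart by checking for the image file before asking Sikuli to search. Here is a minimal Java sketch of such a pre-check, assuming a hypothetical ClickPrecheck class; it uses only java.io.File and makes no Sikuli calls.

```java
import java.io.File;

// Hypothetical pre-check to run before asking Sikuli to click an image.
final class ClickPrecheck {

    enum Outcome {
        IMAGE_FILE_MISSING, // reason (a): there is no image file to search with
        READY_TO_SEARCH     // the file exists; the search itself may still fail, reason (b)
    }

    static Outcome check(String imagePath) {
        return new File(imagePath).isFile()
                ? Outcome.READY_TO_SEARCH
                : Outcome.IMAGE_FILE_MISSING;
    }
}
```

If the outcome is READY_TO_SEARCH and the click still throws FindFailed, the remaining explanation is that nothing on the screen looks like the image.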

This is a pretty simple program that doesn't really show the full power of Sikuli, but I wanted to lay out the basics before I start to get into some really cool stuff in my upcoming posts...

----

import org.sikuli.script.FindFailed;
import org.sikuli.script.Screen;

public class SikuliExample1 {

    public static void main(String[] args) {

        String USER_HOME = System.getProperty("user.home");
        String buttonImage = USER_HOME + "/Sikuli/OK_Button.png";

        Screen s = new Screen();

        try {
            s.click(buttonImage);
        } catch (FindFailed e) {
            System.out.println("Couldn't find image: " + buttonImage);
        }

    }
}

Monday, February 4, 2013

Documenting Use Cases / Tasks / Procedures

One of the fundamental ideas in the design of tools (software or otherwise) is that you must start with an understanding of the purpose of the tool -- 1) the people who will use it, 2) their goals, 3) the context, and 4) the process of use. In software and other complex systems, capturing, organizing and communicating this knowledge can be one of the most difficult parts of the work.

There are numerous techniques for gathering this data -- typically observation and interview-based -- drawn from job analysis and ethnography. It's a great topic, but this article is about the next step: organizing, communicating and utilizing the data after you've gathered it.

In particular, this article is about notation. Once you start to gather large amounts of data about your users, who they are and what they are doing -- how do you effectively document that data so that it can be understood and utilized for your analysis and design work?

Everyone who designs software has to grapple with this problem. One simple approach used in extreme programming (and ad hoc programming) is to rely on one or more experts that have the knowledge (or can get it on demand) and can answer questions as they arise and provide quick feedback on designs as they develop. More traditional software engineering utilizes formalisms such as UML. Somewhere in between you have semi-formal methods such as Contextual Design.

If you do choose to write down information about the users and their tasks and goals, you can find a number of good references to help you decide exactly what to capture, but I think it boils down to a few basic elements based on the 5 Ws.

Who will be using your software? (actor)
What are they trying to do and why? (goal)
How do they do it? (procedure/method)
Where and when do they do it? (context/selection rules)

As I described in an earlier post, I'm particularly interested in representing this data in Domain Specific Languages (DSLs) such as GOMSL. For the past several months I've been working on my own implementation of a DSL for documenting procedures. I want my language to be as human-readable as possible, so I've based it on procedure writing standards described in the Microsoft Manual of Style for Technical Publications.

Basic Form
A procedure starts with a single line in the following pattern: "To <procedure name>:" and is followed by one or more steps. Each step starts with an asterisk (*) and ends with a period (.). A simple step has the following pattern:
* <action> the "<component name>" <component type>.

Here's a small example use case:

To Open_About_Box:
   * Click the "DST-SM" icon.
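As a rough illustration of how little machinery the basic form needs, here is a minimal Java sketch that parses a single procedure like the one above using regular expressions. The ProcedureParser class, its nested types and the field names (action, target, kind) are hypothetical, and the sketch handles only the simple step form, not loops, conditionals or calls to other methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the basic form: a "To Name:" header plus simple steps.
final class ProcedureParser {

    static final class Step {
        final String action; // e.g. Click
        final String target; // e.g. DST-SM
        final String kind;   // e.g. icon

        Step(String action, String target, String kind) {
            this.action = action;
            this.target = target;
            this.kind = kind;
        }
    }

    static final class Procedure {
        final String name;
        final List<Step> steps = new ArrayList<>();

        Procedure(String name) {
            this.name = name;
        }
    }

    private static final Pattern HEADER = Pattern.compile("^To\\s+(\\w+):$");
    private static final Pattern STEP =
            Pattern.compile("^\\*\\s+(\\w+)\\s+the\\s+\"([^\"]*)\"\\s+(\\w+)\\.$");

    // Parses exactly one procedure; any unrecognized line is an error.
    static Procedure parse(String text) {
        Procedure procedure = null;
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            Matcher header = HEADER.matcher(line);
            Matcher step = STEP.matcher(line);
            if (procedure == null && header.matches()) {
                procedure = new Procedure(header.group(1));
            } else if (procedure != null && step.matches()) {
                procedure.steps.add(new Step(step.group(1), step.group(2), step.group(3)));
            } else {
                throw new IllegalArgumentException("Unrecognized line: " + line);
            }
        }
        if (procedure == null) {
            throw new IllegalArgumentException("No procedure header found");
        }
        return procedure;
    }
}
```

A regular-expression approach like this stops scaling once steps can nest, which is one reason to move to a proper grammar for the fuller language.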

There are more complex variations of a step to deal with things like looping, conditionals and calling of other methods, but this post is getting a bit long so I'll just end it here for now. Stay tuned for some longer examples and some sample code that can parse this language and do some cool things like drive the UI and calculate some usability metrics.
