Wednesday, May 15, 2013

Semantic Mediawiki Update

Way back in 2008, I wrote my first posts on Semantic Wikis and how I chose Semantic Mediawiki, and even got into a discussion with Jack about their value.

When I came to ProModel last year, I was able to have Semantic Mediawiki installed and started using it as a knowledge repository for the development team.  I store the following kinds of information in the wiki:

  1. Background information about the customer, their requirements, business rules, their business and their field.  Particularly useful is an encyclopedia of all of the terms that are used in our application and in design discussions.  This is what Mediawiki was built for and what it excels at.
  2. Notes from meetings, usability research and customer feedback.  These tend to be semi-structured and used as references for designs.  The "Semantic" extension of Mediawiki allows me to mark up my notes with category and property tags so that they can be organized and found later when I need them.
  3. Finally, I write the complete user interface specification in the wiki.  The wiki gives me a quick and easy way to publish on our intranet and share with the customer.  The semantic tags allow me to slice and dice the specification for analysis (e.g., show me all the screens that have an "Export" button) and integration with my cool automated testing rig (see my earlier posts on Sikuli).  A nice benefit of driving your testing directly from the specification is that it forces the specification to be up-to-date (or the tests fail).
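As an illustration of what those semantic tags make possible, here is a minimal sketch of pulling a list of pages out of the wiki through the action=ask API module that Semantic Mediawiki adds to MediaWiki's api.php.  The wiki address, the "Screen" category, and the "Has button" property below are hypothetical placeholders, not the actual tags in my specification:

import java.net.URLEncoder
import scala.io.Source

// Minimal sketch: list wiki pages in a (hypothetical) "Screen" category that
// carry a (hypothetical) "Has button::Export" property, using the action=ask
// API module that Semantic Mediawiki adds to api.php.
object FindExportScreens extends App {
  val wikiRoot = "http://intranet.example.com/wiki"           // hypothetical wiki root
  val askQuery = "[[Category:Screen]][[Has button::Export]]"  // hypothetical semantic query

  val url = wikiRoot + "/api.php?action=ask&format=json&query=" +
    URLEncoder.encode(askQuery, "UTF-8")

  // Print the raw JSON result; a real tool would parse it and feed the
  // page titles into the analysis or the automated testing rig.
  println(Source.fromURL(url).mkString)
}

A query like this is what drives the "show me all the screens with an Export button" style of analysis mentioned above.
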
Benefits of Semantic Mediawiki:
  1. It's free and open source
  2. It's very reliable
  3. Tons of free and open source extensions
  4. Quick and easy way to publish online, with semantic tags for retrieval, analysis and automation
Drawbacks of Semantic Mediawiki:
  1. As with many open source projects, you need to be somewhat technically savvy to install and configure the product and its extensions.
  2. The Semantic Mediawiki extension is not perfectly compatible with some other useful MediaWiki extensions
  3. Not everyone is comfortable writing on a wiki.  Adoption can be sporadic.
Overall, I'm happy with the product, but I would consider other options that offer the same benefits with fewer of the drawbacks.  However, many of the best options are not free and charge a monthly, per-user fee.  For example, I have used the Socialtext and SharePoint wikis previously, and they seem to be more usable (in some ways), but less semantic and much more expensive.

Tuesday, April 2, 2013

The Visual Style Guide


Following up on my earlier post about the UX Team, I'm currently collaborating with a graphic designer at work to put together a visual style guide for my web-based application.  It's intended to be a reference for our developers, showing how all of the visual elements work together and supporting consistency throughout the application. A visual style guide often includes specifications (with examples) for the use of logos, typography, color, layout, and navigation.


It has been a while since I sampled the internet for examples of this type of documentation, so I spent some time this morning googling around and found the following resources:

  • Yale Visual Identity: This is a classic example of a visual standard that typically comes from a marketing department and specifies how to use a logo, color and typography in printed products and online to best communicate a particular brand.
  • SAP Interaction Design Guide: A comprehensive guide that includes logo, typography, layout, navigation and much more.  The guide seems a bit dated, but is an excellent example of the type of content that you'd like to see in a full featured style guide.
  • BBC Global Experience Language: An extensive reference similar to the SAP guide above.  This guide also specifies a particular layout grid and gives examples of how the various visual elements lay out on the grid.  Interestingly, I find the printed version to be much easier to use.
  • New School Web Style Guide: A good example of a much smaller, but still useful visual style reference.  You could produce something like this quickly and then iterate on it as needed.
  • UX Guidelines for Metro: Metro is a design language created by Microsoft and used in Windows Phone and Windows 8.  The project I'm currently working on uses Microsoft Silverlight (but not Metro), so I'm interested in seeing how this might apply.





Thursday, February 28, 2013

The User Experience Team

After researching and writing my last post on Strategic UX Design, I started thinking more about how I've spent my time at the various organizations that I've worked for.  Early in my career I spent a good bit of time as a UI developer and later as a researcher and project manager for UI-intensive projects.

If I map my experience onto the 5 Competencies of UX Design diagram from UX matters, I've invested heavily in Interaction Design and Prototype Engineering, with quite a bit of Information Architecture recently and scattered, intense periods of Usability Engineering and Visual Design.  In terms of UX Strategy, I've spent some time on it at the beginning of each project.  However, it's often hard to maintain a focus on strategy within the daily grind of software design and development.

It's not surprising to note that my most successful and productive periods of UX work have come while working within a good team.  When two people are working well together, it can more than double your productivity.  The same can be true (to a lesser extent) for larger teams if you have quality people and good leadership.

In some of my past projects, I've been lucky enough to work with trained and experienced UX professionals.  In other cases, I was able to recruit and develop folks with other backgrounds such as documentation, testing, UI software development, business analysis and graphic design.

In this post, I want to capture my model of the roles on a good UX team. Often two or more of these roles are filled by a single person (depending on the resources available for UX design), but to the extent that the roles can be filled by quality specialists (perhaps borrowed from other groups in the organization), you can expect to see corresponding performance improvements.
  • UX Lead: (objectives, strategy, total customer experience, interface with other leads [dev, qa, marketing, sales, senior management?], UX process and integration, manage team, prioritization)
  • Business Analyst/Research: (understand and document business needs/rules/requirements, research and document information architecture and terminology, help develop value map, interface with customers, other business analysts, documentation team, marketing and sales)
  • Interaction and Visual Design: (UI logical framework, look and feel standards, contribute to UI specifications/prototypes for each UI element (screen, control, etc.), ensure designs meet measurable objectives, interface with marketing and dev)
  • Usability Testing/Research: (set up and run usability tests/research, document findings, track and document usability issues / feature requests, interface with QA team, overlap with UI functionality testing, ensure that UI conforms with specs and standards)
  • Prototype/UI Engineering: (develop UI framework and reusable components, rapid prototyping of new designs, help translate new designs into production code, interface with dev)
Note: I referenced the excellent resource, 5 Competencies of UX Design, as I was writing this up.  You should expect to see a good amount of overlap.

Not sure if this doodle adds anything to the post, but it was fun to draw. :-)

Wednesday, February 27, 2013

Strategic UX Design


Recently, I've been thinking quite a bit about strategic UX design. I'm sure there are many ways to think about this, but here's my cut:
  • Tactical UX Design is about using specific tools and techniques to design a particular user experience.  UX matters provides a nicely organized list of UX techniques and work products in their 5 Competencies of UX Design diagram (see below).

Nicely organized list of Tactical UX Design skills and work products from UX matters
  • Strategic UX Design is where an organization decides how to apply its limited resources to achieve some effect.  This should be a measurable effect (i.e., key performance indicator) such as an impact on sales, improved user satisfaction, reduced training, improved throughput/task completion, error reduction, etc.  I think Leisa Reichelt's diagram (see below) does a nice job of describing the relationship between business strategy, UX strategy and UX tactical execution.
A model of the relationship between business strategy and UX from Strategic UX.
From my perspective, the key UX strategy activities include the following:
  1. Setting the Scope and Objectives: It's worthwhile to spend a little time thinking about who the "Strategic UX Thinker" is reporting to (CEO, CTO, Project Manager, Product Manager, etc.), what their goals are, and how they and their goals relate to the rest of the organization.  UX is often housed in the development department, but it requires input and has an impact on other parts of the business including sales, marketing and other products.
  2. Understanding the Customers and their Contexts: This step identifies the "big picture" of the software experience.  What kind of situations are the users in when they are using the software?  How are they feeling and how do we want them to feel? What challenges are they facing and how can we help? Here is where you should be able to identify "big picture" issues and opportunities that a company can address.  Contextual Design, Personas, and Value Maps are three key techniques for capturing and communicating this information.
  3. Teamwork and Process: I've been in enough organizations to know that UX can work in just about any kind of team and software development process.  Key strategic UX issues include:
    1. How are the UX objectives prioritized relative to other product objectives and are there sufficient resources to meet the short-term and long-term UX objectives?
    2. What is the organizational structure and development process and how do they evolve over time to meet new needs?
    3. How is organizational knowledge of the customer captured, communicated and utilized within the organization?
Well, that's it for this post.  If you're interested in reading more, here's a nice list of other resources on the topic of Strategic UX Design:

Tuesday, February 26, 2013

A Simple Scala Internal DSL

My earlier posts have talked about why I'm using Scala to create a DSL for documenting user interface procedures.  I also introduced Sikuli and posted a simple Java program for capturing on-screen images and then later using the Sikuli vision library to find them on the screen and click them.

In this post, I will model a very simple Internal DSL in Scala that can execute this script:
Click the "OK".button;

As you can see, and as is often the case with internal DSLs, I couldn't quite get the syntax the way I wanted, but it's pretty close.  A lesson learned is that internal DSLs are best used in situations where you aren't too picky about the exact syntax.  In my next post I will share an external DSL that has the exact syntax that I want (and some other benefits as well).

Here's how to read the code (at the bottom of this post):

  • Imports
  • I create a Scala object (MethodRunner) that extends App (basic Scala stuff)
  • I create the Sikuli screen object (main object for executing Sikuli commands)
  • I create the "Click" object (the first word in my syntax example above)
  • I define a "the" method (second word in my syntax) which takes in a single Component as a parameter.
    • When called, it creates an image filename and checks to see if the file exists or not.
    • If it exists, it asks Sikuli to find a matching image on the screen and click it.
    • If it does not exist, it asks Sikuli to capture the image and store it for the next time.  See my last post for a more in depth explanation of how this works.
  • The next section converts the "OK".button syntax into a Component object.  It's basic Scala Internal DSL stuff.  For more information on this topic, take a look at Designing Internal DSLs in Scala by Debasish Ghosh.
  • There is a section that does the work of capturing screenshots in Sikuli (called from "the" method above).  Again, see my last post for an explanation.
  • Finally, I embed the Click the "OK".button; script inline.  This is one of the benefits (and drawbacks) of an internal DSL.



----

import org.sikuli.script.ScreenHighlighter
import org.sikuli.script.Screen
import org.sikuli.script.Location
import org.sikuli.script.CapturePrompt
import org.sikuli.script.Observer
import org.sikuli.script.Subject
import java.io.File
import org.sikuli.script.Pattern
import javax.imageio.ImageIO

object MethodRunner extends App {

  // =============================
  // Sikuli screen object
  var screen = new Screen()

  // =============================
  // Script Actions  
  object Click {
    def the(in: Component) {
      println("Click the " + compDesc(in))

      var filename = safeFilename(in);
      println(filename)
      var file = new File(filename);

      if (file.exists()) {
        // We already have an image for this component, so ask Sikuli to click it
        screen.click(new Pattern(filename));
      } else {
        // First time this component is used: prompt the user to select it on
        // screen so the image can be captured and saved for next time
        captureComponent = in
        cp.prompt("I don't know what the '" + compDesc(in) +
          "' looks like.  Please select it.");
      }
    }
  }

  // =============================
  // this supports the syntax "OK".button
  implicit def toComponentBuilder(in: String) =
    new ComponentBuilder(in)
  class ComponentBuilder(name: String) {
    def button(): Component = {
      return new button(name);
    }
  }

  abstract class Component(val name: String)
  class button(name: String) extends Component(name)
  def compDesc(in: Component) = in.name + "." +
    in.getClass().getSimpleName()

  // =============================
  // capture and store screenshots for Sikuli 
  var cp = new CapturePrompt(screen)
  cp.addObserver(CaptureObserver)
  // Remember which component is being captured so CaptureObserver can name the saved image file
  var captureComponent: Component = null

  var USER_HOME = System.getProperty("user.home");
  def safeFilename(comp: Component) = USER_HOME +
    "/Method/img/cache/" + comp.name.replaceAll("\\W+", "_") +
    "_" + comp.getClass().getSimpleName() + ".png";

  object CaptureObserver extends Observer {
    def update(s: Subject) {
      var img = cp.getSelection()
      ImageIO.write(img.getImage(), "png", 
          new File(safeFilename(captureComponent)));
      cp.close()
    }
  }

  // =============================
  // The Script

  Click the "OK".button;

}

Tuesday, February 12, 2013

Sikuli Trick: Capturing images with the Sikuli Java library

In my last post, I introduced Sikuli, talked about its capabilities and power, and posted a very small code sample that showed the basics of using the Sikuli Java library.

In that code sample, I showed how you could use Sikuli to load an image, search for it on the screen, and click it.  However, there is a big gap in that workflow -- if everything is based on images in Sikuli, where do all of those images come from?

Luckily, Sikuli provides a very nice capability to support the selection, capture, and storage of images from the screen.  If you use their UI, you get it for free.  It's also available in the Java library, but it's not well documented, so I'll post it here for future Googlers to find.

The interesting difference between the code below and the code from my last post happens when the image file does not exist (!imageFile.exists()).  The CapturePrompt.prompt() call freezes the screen and gives you a crosshair to make a selection.  This selection is then passed as a ScreenImage object to the update() callback.  I retrieve the Image object and save it as a file.  This happens the first time the program is run.  Each subsequent time the program is run, it will find the file and then click it.

In my next post, I'll tie this back to the Scala DSL that I posted about on Jan 30 and Feb 4 to make an English-like language for describing procedures that the computer can execute.

----


import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

import org.sikuli.script.CapturePrompt;
import org.sikuli.script.FindFailed;
import org.sikuli.script.Observer;
import org.sikuli.script.Screen;
import org.sikuli.script.ScreenImage;
import org.sikuli.script.Subject;

public class SikuliExample2 {

  public static void main(String[] args) {

    // Create a Sikuli Screen object.
    Screen screen = new Screen();

    // Specify an image that we'd like to click
    String USER_HOME = System.getProperty("user.home");
    String buttonImage = USER_HOME + "/Sikuli/OK_Button.png";
    final File imageFile = new File(buttonImage);

    // If the image file doesn't exist, screen grab 
    // Note, this only needs to be done the first time
    if (!imageFile.exists()) {
      final CapturePrompt cp = new CapturePrompt(screen);
      cp.prompt("'OK_Button.png' not found. Please select it.");
      cp.addObserver(new Observer() {
        public void update(Subject arg0) {
          ScreenImage img = cp.getSelection();
          try {
            ImageIO.write(img.getImage(), "png", imageFile);
          } catch (IOException e) {
            e.printStackTrace();
          }
          cp.close();
        }
      });
    } else {
      // Find the button and click it.
      try {
        screen.click(buttonImage);
      } catch (FindFailed e) {
        System.out.println("Couldn't find: "+buttonImage);
      }
    }
  }
}

Friday, February 8, 2013

Sikuli: An "On Screen" Computer Vision Library

In my last post, I briefly mentioned Sikuli as a tool that I've been using in combination with other technologies for usability analysis and automated testing.  However, before I get deeper into how I've been using it, this post will give my overview on what Sikuli is and why I think it is so useful.

If you go to the Sikuli website (http://www.sikuli.org/), you'll see Sikuli described as a tool for either: 1) running macros, or 2) automated software testing.  It can do both of these things quite well, but it's important to note that Sikuli is much deeper and more powerful than other tools that can do this.

From my perspective Sikuli consists of several parts:
  • OpenCV: Sikuli utilizes the OpenCV computer vision library. This is where the real power comes from.  Unlike traditional macro and software testing tools, Sikuli is based on vision.  If you can see it on the computer screen, so can Sikuli.  It's not perfect (text recognition is a weakness), but it can handle situations that other tools can't.  For example, you can write a Sikuli script to play a game that is written in Flash (or Silverlight, or HTML, or C++) even if it doesn't have an API.  As long as there are things to see and react to, Sikuli can be used.
  • CV Tuned to the Computer Screen: This isn't really a separate 'part' per se, but it's so important that it's worth its own bullet.  Computer vision is such a broad and deep topic that an average Joe who tried to download OpenCV and use it for recognizing something on the computer screen would have a lot of learning and work to do.  The creators of Sikuli at the MIT User Interface Design Group did all of this work and bundled it into Sikuli for you.
  • Java API: The base implementation of Sikuli is written as a Java library.  You can bundle this into any Java application.  It's my preferred method of using Sikuli.
  • Custom UI and Jython: This layer is a nod towards usability for non-programmers.  There are some cool features here, but as a programmer, it doesn't really fit what I'm trying to do.
Below I've provided code that shows the simplest possible example of using the Sikuli Java API to do something.  Here's what it does:

  1. First, I provide the location of a screenshot of an OK button that I'm going to ask Sikuli to find and click for me.  In future versions of this program, I will check right here that the file really exists (and if it doesn't exist, I will have Sikuli help me create it).
  2. Next, I create an instance of the 'Screen' class.  It's the starting point for most of Sikuli's functionality.
  3. Finally, I have a try-catch block where I ask Sikuli to find something that looks like the "OK_button.png" on the screen and click it.  It might fail for two reasons: a) the file doesn't exist, or b) it exists, but there is nothing on the screen that looks like it.
This is a pretty simple program that doesn't really show the full power of Sikuli, but I wanted to lay out the basics before I start to get into some really cool stuff in my upcoming posts...

----

import org.sikuli.script.FindFailed;
import org.sikuli.script.Screen;


public class SikuliExample1 {

 public static void main(String[] args) {

  // Path to a screenshot of the OK button that we want Sikuli to find and click
  String USER_HOME = System.getProperty("user.home");
  String buttonImage = USER_HOME+"/Sikuli/OK_Button.png";

  // Screen is the starting point for most of Sikuli's functionality
  Screen s = new Screen();

  // Ask Sikuli to find something that looks like the image on screen and click it
  try {
   s.click(buttonImage);
  } catch (FindFailed e) {
   System.out.println("Couldn't find image: "+buttonImage);
  }

 }

}

Monday, February 4, 2013

Documenting Use Cases / Tasks / Procedures

One of the fundamental ideas in the design of tools (software or otherwise) is that you must start with an understanding of the purpose of the tool -- 1) the people who will use it, 2) their goals, 3) the context, and 4) the process of use. In software and other complex systems, capturing, organizing and communicating this knowledge can be one of the most difficult parts of the work.

There are numerous techniques for gathering this data -- typically observation and interview-based -- drawn from job analysis and ethnography.  It's a great topic, but this article is about the next step, organizing, communicating and utilizing the data after you've gathered it.

In particular, this article is about notation.  Once you start to gather large amounts of data about your users, who they are and what they are doing -- how do you effectively document that data so that it can be understood and utilized for your analysis and design work?

Everyone who designs software has to grapple with this problem.  One simple approach used in extreme programming (and ad hoc programming) is to rely on one or more experts that have the knowledge (or can get it on demand) and can answer questions as they arise and provide quick feedback on designs as they develop.  More traditional software engineering utilizes formalisms such as UML.  Somewhere in between you have semi-formal methods such as Contextual Design.

If you do choose to write down information about the users and their tasks and goals, you can find a number of good references to help you decide exactly what to capture, but I think it boils down to a few basic elements based on the 5 Ws.
  • Who will be using your software? (actor)
  • What are they trying to do and why? (goal)
  • How do they do it? (procedure/method)
  • Where and when do they do it? (context/selection rules)
As I described in an earlier post, I'm particularly interested in representing this data in Domain Specific Languages (DSLs) such as GOMSL. For the past several months I've been working on my own implementation of a DSL for documenting procedures.  I want my language to be as human-readable as possible, so I've based it on procedure writing standards described in the Microsoft Manual of Style for Technical Publications.

Basic Form
A procedure starts with a single line in the following pattern: "To <method name>:" and is followed by one or more steps. Each step starts with an asterisk (*) and ends with a period (.). A simple step has the following pattern:
* <action> the "<component name>" <component type>.

Here's a small example use case:

To Open_About_Box:
   * Click the "DST-SM" icon.
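
To make the notation concrete, here is a minimal sketch (not my full parser) of how the header and a simple step could be recognized with a couple of regular expressions in Scala; the object name and printed output are just for illustration:

// A minimal sketch of recognizing the notation above; the full language also
// has looping, conditionals and calls to other methods.
object ProcedureSketch extends App {
  val header = """To\s+(\w+):""".r
  val step   = """\*\s*(\w+) the "([^"]+)" (\w+)\.""".r

  val script =
    "To Open_About_Box:\n" +
    "   * Click the \"DST-SM\" icon."

  script.split("\n").map(_.trim).foreach {
    case header(name)                  => println("Procedure: " + name)
    case step(action, component, kind) =>
      println("  Step: " + action + " / " + component + " / " + kind)
    case line                          => println("  Unrecognized: " + line)
  }
}

Run against the example above, this prints the procedure name and a single Click step.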

There are more complex variations of a step to deal with things like looping, conditionals and calling of other methods, but this post is getting a bit long so I'll just end it here for now.  Stay tuned for some longer examples and some sample code that can parse this language and do some cool things like drive the UI and calculate some usability metrics.


Wednesday, January 30, 2013

Using Scala and Sikuli to create a DSL for analyzing and testing UI Methods

Ever since I first read about GOMS, I've been intrigued by the idea that you could write a description of a process that was both human readable and parseable by a computer.  A language like this could have multiple benefits:
  • The instructions to users could double as a form of automated UI testing -- potentially giving you two benefits for the cost of one.
  • You could guarantee that the instructions that you provide to users were correct.  Incorrect instructions would fail to execute during your automated testing (which could be run after every build).  This would catch cases where the developers change the UI without informing the documentation, testing and/or specification teams.
  • The instructions could be analyzed using the GOMS methodology to give you free usability metrics (time to complete, learnability, some types of error prediction); a rough sketch of this idea follows the list below.
  • Other types of analysis become possible based on the structure of the user interface (e.g., which screens and/or controls are most documented, how similar the procedures used on different screens are, etc.).
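To give a feel for the "free usability metrics" idea above, here is a rough sketch of a keystroke-level time estimate for a single-click procedure like "Click the OK button."  The operator times are approximate values commonly cited in the GOMS/KLM literature, and the calculation is my simplification for illustration, not a full GOMS analysis:

// Rough sketch (an approximation, not a full GOMS tool): estimate execution
// time for a procedure by mapping each step to keystroke-level operators.
object KlmSketch extends App {
  // Approximate keystroke-level operator times in seconds
  val M = 1.35  // mental preparation
  val P = 1.10  // point with the mouse
  val B = 0.20  // mouse button click (press + release)

  // A simple "Click the <component>" step costs roughly M + P + B
  def clickStepTime: Double = M + P + B

  // One click step, as in a one-step "open the About box" procedure
  val total = List(clickStepTime).sum
  println(f"Estimated execution time: $total%.2f seconds")
}
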
My interest in this area led me to pursue my Master's in HCI at Carnegie Mellon with Bonnie John and later work with David Kieras while I was at Soar Technology.  This work was documented in the following papers:
  • Hudson, S. E., John, B. E., Knudsen, K., & Byrne, M. D. (1999). “A Tool for Creating Predictive Performance Models from User Interface Demonstrations.” Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 99-103. New York: ACM Press.
  • Kieras, D., and Knudsen, K. (2006). Comprehensive Computational GOMS Modeling with GLEAN. In Proceedings of BRIMS 2006, Baltimore, May 16-18.
Unfortunately, funding for that work was not continued, but as I started my User Experience Design work at ProModel -- including elements of UI analysis, specification, documentation and testing -- I was reminded again of the benefits of having a human and machine readable language for describing procedures.

For various reasons, I decided to take a fresh approach.
  • I designed my own Domain Specific Language (DSL) based on the guidelines found in the Microsoft Manual of Style for Technical Publications.
  • I tried various methods to encode my DSL.  I've been most happy with Scala so far and will have some posts about my experiments with both its internal and external DSL capabilities.
  • I've also incorporated Sikuli (a computer vision-based testing library) to provide a mechanism for automatically executing my procedures.
Well, I guess this is long enough for an introductory post.  Stay tuned for a series of posts that document my prototypes and results thus far...