https://www.codemag.com/article/0810042
While general accessibility requirements (such as font colors in UI rendering) are important, programmatic access to the graphical user interface (GUI) is crucial to improving accessibility.
On the Windows® operating system, Microsoft® Active Accessibility® and User Interface (UI) Automation support this programmatic access. This article provides a quick overview of Windows Automation API 3.0 featured in Windows 7.
Windows Automation API 3.0: a Bit of History
Today, the Windows operating system offers two application programming interface (API) specifications for user interface accessibility and software test automation. The legacy API, Microsoft Active Accessibility, was introduced as a platform add-on for Windows 95 in 1996. The new API is a Windows implementation of the User Interface Automation specification called UI Automation. UI Automation was introduced in Windows Vista® and .NET Framework 3.0.
Although the two technologies are different, the basic design principles are similar. Both expose the UI object model as a tree hierarchy rooted at the desktop. Microsoft Active Accessibility represents individual UI elements as accessible objects, and UI Automation represents them as automation elements. Both refer to the accessibility tool or software automation program as the client. However, Microsoft Active Accessibility refers to the application or control offering the UI for accessibility as the server, while UI Automation refers to this as the provider.
Microsoft Active Accessibility offers a single COM interface with a fixed, small set of properties. UI Automation offers a richer set of properties, as well as a set of extended interfaces called control patterns to manipulate automation elements in ways Microsoft Active Accessibility cannot.
While UI Automation previously had both managed and unmanaged APIs for providers, the original release had no unmanaged interfaces for clients. With Windows Automation API 3.0, you can finally write UI Automation clients entirely in unmanaged code.
The new API also provides support for transitioning from Microsoft Active Accessibility servers to UI Automation providers. The IAccessibleEx interface enables legacy Microsoft Active Accessibility servers to add support for specific UI Automation patterns and properties without rewriting their whole implementation. The specification also allows in-process Microsoft Active Accessibility clients to access UI Automation provider interfaces directly, rather than through the UI Automation client interfaces.
The ecosystem of Windows automation technologies, now called Windows Automation API, includes classic Microsoft Active Accessibility and Windows implementations of the UI Automation specification. At Microsoft, the UI Automation specification is implemented on Windows Vista, Windows Server 2008, Windows Presentation Foundation (WPF), XPS Viewers, and many other upcoming Microsoft products. Windows 7, Windows Internet Explorer 8, and Silverlight 2.0 are joining the pack soon.
The Architecture: Microsoft Active Accessibility, UI Automation, and Interoperability
The goal of Microsoft Active Accessibility is to expose basic information about custom controls such as control name, location on screen, and type of control, as well as state information such as visibility and enabled/disabled status. The UI is represented as a hierarchy of accessible objects; changes and actions are represented as WinEvents. The following components comprise the Microsoft Active Accessibility architecture:
- Accessible Object: A logical UI element (such as a button) that is represented by an IAccessible COM interface and an integer ChildID.
- WinEvents: An event system that enables servers to notify clients when an accessible object changes.
- OLEACC.dll: A run-time dynamic-link library that provides the Microsoft Active Accessibility API and the accessibility system framework.
For Microsoft Active Accessibility, the system component of the accessibility framework (OLEACC.dll) facilitates communication between accessibility tools and applications (Figure 1). The applications (Microsoft Active Accessibility servers) provide UI accessibility information to tools (Microsoft Active Accessibility clients), which interact with the UI on behalf of users. The code boundary can be a programmatic or process boundary.
Figure 1: Microsoft Active Accessibility uses OLEACC.dll to communicate between clients, like screen readers, and servers, such as Windows applications.
The goal of UI Automation is similar but broader, as described later in this article. From an architecture point of view, UI Automation loads the UI Automation Core component into both the accessibility tools’ and applications’ processes (Figure 2). This component manages cross-process communication and provides higher level services. This core component enables bulk fetching or caching of properties, which improves the cross-process performance over Microsoft Active Accessibility implementation.
Figure 2: User Interface Automation (UI Automation) uses the UI Automation Core to communicate between clients and providers and uses proxies to communicate with legacy implementations.
Interoperability between Microsoft Active Accessibility-based and UI Automation-based Applications
The UIA-to-MSAA Bridge enables Microsoft Active Accessibility clients to access UI Automation providers by converting the UI Automation object model to a Microsoft Active Accessibility object model (Figure 3). Similarly, the MSAA-to-UIA Proxy (Figure 4) translates Microsoft Active Accessibility-based server object models for UI Automation clients.
Figure 3: UIA-to-MSAA Bridge enables Microsoft Active Accessibility clients to access UI Automation providers.
Figure 4: The MSAA-to-UIA Proxy enables UI Automation clients to access Microsoft Active Accessibility servers.
Now with IAccessibleEx, you can also improve existing Microsoft Active Accessibility server implementations by adding only required UI Automation object model information. The MSAA-to-UIA Proxy takes care of incorporating the added UI Automation object model.
Limitations of Microsoft Active Accessibility
Microsoft designed the Microsoft Active Accessibility object model around the time Windows 95 was released. The model is based on “roles” defined a decade ago, and you cannot support new UI behaviors or merge two or more roles together. There is no text object model, for example, to help assistive technologies deal with complex Web content.
Another limitation involves navigating the object model. Microsoft Active Accessibility represents the UI as a hierarchy of accessible objects. Clients navigate from one accessible object to another using interfaces and methods available from the accessible object. Servers can expose the children of an accessible object with properties of the accessible object or with the IEnumVARIANT COM interface. Clients, however, must be able to deal with both approaches for any server. This ambiguity means extra work for client implementers, and the complexity can contribute to unforeseen problems in server implementations.
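The extra client work can be pictured with a toy model. The Python below is only a conceptual sketch and does not touch the real COM interfaces: `IndexedServer`, `EnumeratingServer`, and `list_children` are hypothetical names, standing in for a server that exposes children through its own properties and a server that offers an enumerator instead.

```python
from typing import Iterator, List


class IndexedServer:
    """Hypothetical server: children reachable by count and index.

    Indexes are 1-based here, mirroring MSAA child IDs, where 0 refers
    to the object itself.
    """

    def __init__(self, names: List[str]) -> None:
        self._names = names

    def child_count(self) -> int:
        return len(self._names)

    def child(self, index: int) -> str:
        return self._names[index - 1]


class EnumeratingServer:
    """Hypothetical server: children reachable only through an enumerator."""

    def __init__(self, names: List[str]) -> None:
        self._names = names

    def enumerate_children(self) -> Iterator[str]:
        return iter(self._names)


def list_children(server: object) -> List[str]:
    """Client helper that has to cope with either style of server."""
    enumerate_children = getattr(server, "enumerate_children", None)
    if enumerate_children is not None:
        return list(enumerate_children())
    # Fall back to the count-and-index style.
    return [server.child(i) for i in range(1, server.child_count() + 1)]


names = ["OK", "Cancel", "Browse..."]
print(list_children(IndexedServer(names)))
print(list_children(EnumeratingServer(names)))
```

Every client needs a helper of this shape, and every server author has to guess which path clients will actually exercise, which is where the unforeseen problems tend to come from.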
Just as important is the inability to extend Microsoft Active Accessibility properties or functions without breaking or changing the IAccessible COM interface specification. The result is that you cannot expose new control behaviors or properties through the object model. The object model tends to be both static and stagnant.
UI Automation Specification
The UI Automation specification provides more flexible programmatic access to UI elements on the desktop, enabling assistive technology products such as screen readers to provide information about the UI to end users and to manipulate the UI by means other than standard input. The specification can be supported across platforms other than Microsoft Windows.
The implementation of that specification in Windows is also called UI Automation. UI Automation is broader in scope than just an interface definition. UI Automation provides:
- An object model and functions that make it easy for client applications to receive events, retrieve property values, and manipulate UI elements.
- A core infrastructure for finding and fetching across process boundaries.
- A set of interfaces for providers to express tree structure and some general properties.
- A set of interfaces for providers to express other properties and functionality specific to the control type.
To improve on Microsoft Active Accessibility, UI Automation aims to address the following goals:
- Enable efficient out-of-process clients, while continuing to allow in-process access.
- Expose more information about the UI in a way that allows clients to be out-of-process.
- Coexist with and leverage Microsoft Active Accessibility without inheriting its limitations.
- Provide an alternative to IAccessible that is simple to implement.
The implementation of the UI Automation specification in Windows features COM-based interfaces and managed interfaces.
UI Automation Elements
UI Automation exposes every piece of the UI to client applications as an Automation Element. Providers supply property values for each element; invoking a method on an element invokes the corresponding method on the provider. Elements are exposed as a tree structure, with the desktop as the root element.
Automation elements expose common properties of the UI elements they represent. One of these properties is the control type, which describes its basic appearance and functionality (for example, a “button” or a “check box”).
UI Automation Tree
The Automation Tree represents the entire UI: the root element is the current desktop and child elements are application windows. Each of these child elements can contain elements representing menus, buttons, toolbars, and so on. These elements in turn can contain elements like list items, as shown in Figure 5.
Figure 5: This UI Automation tree represents all elements on the Run dialog.
UI Automation providers for a particular control support navigation among the child elements of that control. However, providers are not concerned with navigation between these control subtrees. This is managed by the UI Automation core, using information from the default window providers.
To help clients process UI information more effectively, the framework supports alternative views of the automation tree: (1) Raw View, (2) Control View, and (3) Content View. As Table 1 shows, the type of filtering determines the views, and the client defines the scope of a view.
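The relationship among the three views can be sketched with a small model. The Python below is a conceptual illustration only and does not call the Windows API. The `Element` class, the `view` function, and the sample tree are hypothetical; the two boolean flags are named after the IsControlElement and IsContentElement element properties, and treating them as the filter for each view is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Element:
    """Hypothetical stand-in for an automation element."""
    name: str
    is_control: bool = True   # eligible for the control view
    is_content: bool = True   # eligible for the content view
    children: List["Element"] = field(default_factory=list)


def view(root: Element, keep: Callable[[Element], bool]) -> List[str]:
    """Return the names of elements that pass the filter, depth first.

    A filtered-out element is skipped, but its descendants are still
    visited, so a kept child is promoted past a hidden parent.
    """
    names = [root.name] if keep(root) else []
    for child in root.children:
        names.extend(view(child, keep))
    return names


def raw(element: Element) -> bool:
    return True


def control(element: Element) -> bool:
    return element.is_control


def content(element: Element) -> bool:
    return element.is_control and element.is_content


run_dialog = Element("Run", children=[
    Element("layout pane", is_control=False, is_content=False, children=[
        Element("Open:", is_content=False),
        Element("combo box", children=[Element("notepad")]),
    ]),
    Element("OK"),
])

print(view(run_dialog, raw))
print(view(run_dialog, control))
print(view(run_dialog, content))
```

Each view is the same tree seen through a stricter filter, which is why the content view in this model is always a subset of the control view, and the control view a subset of the raw view.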
UI Automation Properties
The UI Automation specification defines two kinds of properties: (1) automation element properties and (2) control pattern properties. Automation element properties apply to most controls, providing fundamental information about the element such as its name. Examples are listed in Table 2. Control pattern properties apply to control patterns, described next.
Unlike with Microsoft Active Accessibility, every UI Automation property is identified by a GUID and a programmatic name, which makes new properties easier to introduce.
UI Automation Control Patterns
A control pattern describes the attributes and functionality of an automation element. For example, a simple “clickable” control like a button or hyperlink should support the Invoke control pattern to represent the “click” action.
Each control pattern is a canonical representation of possible UI features and functions, as shown in Table 3. There are 22 control patterns defined to date, some of which are new in Windows 7, and the Windows Automation API can support custom control patterns. Unlike with Microsoft Active Accessibility role or state properties, one automation element can support multiple UI Automation control patterns.
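To make the contrast with a single fixed role concrete, here is a minimal sketch in Python. It is not the Windows API: the pattern classes, the `Element` class, and `get_pattern` are hypothetical simplifications, chosen to show one element answering for several patterns while declining one it does not support.

```python
from typing import Dict, Optional, Type, TypeVar

P = TypeVar("P")


class InvokePattern:
    """Hypothetical model of Invoke: one unambiguous action, like a click."""

    def __init__(self) -> None:
        self.invoke_count = 0

    def invoke(self) -> None:
        self.invoke_count += 1


class ExpandCollapsePattern:
    """Hypothetical model of ExpandCollapse: show or hide child content."""

    def __init__(self) -> None:
        self.state = "Collapsed"

    def expand(self) -> None:
        self.state = "Expanded"

    def collapse(self) -> None:
        self.state = "Collapsed"


class TogglePattern:
    """Hypothetical model of Toggle: cycle through a set of states."""

    def __init__(self) -> None:
        self.state = "Off"

    def toggle(self) -> None:
        self.state = "On" if self.state == "Off" else "Off"


class Element:
    """An element is described by whichever patterns it supports."""

    def __init__(self, name: str, *patterns: object) -> None:
        self.name = name
        self._patterns: Dict[type, object] = {type(p): p for p in patterns}

    def get_pattern(self, pattern_type: Type[P]) -> Optional[P]:
        """Return the pattern object, or None if it is not supported."""
        return self._patterns.get(pattern_type)  # type: ignore[return-value]


# A split button both performs an action and opens a drop-down.
split_button = Element("Save", InvokePattern(), ExpandCollapsePattern())

invoke = split_button.get_pattern(InvokePattern)
if invoke is not None:
    invoke.invoke()

expander = split_button.get_pattern(ExpandCollapsePattern)
if expander is not None:
    expander.expand()

print(split_button.get_pattern(TogglePattern))  # not supported by this element
```

The client never asks what the element is. It asks what the element can do, and it checks for each capability separately.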
UI Automation Control Types
A control type is another automation element property that specifies a well-known control that the element represents. Currently, UI Automation defines 38 control types, including Button, Check Box, Combo Box, Data Grid, Document, Hyperlink, Image, ToolTip, Tree, and Window.
Before you can assign a control type to an element, the element needs to meet certain conditions, including a particular automation tree structure, property values, control patterns, and events. However, you are not limited to these. You can extend a control with custom patterns and properties as well as with the pre-defined ones.
The total number of pre-defined control types is significantly lower than the number of Microsoft Active Accessibility accRole definitions, because you can combine UI Automation control patterns to express a larger set of features, whereas Microsoft Active Accessibility roles cannot be combined. You can also customize the description of a control type with the LocalizedControlType property while keeping the baseline type as defined.
UI Automation Events
UI Automation events notify applications of changes to and actions taken with automation elements. The four different types of UI Automation events, as listed in Table 4, do not necessarily mean that the visual state of the UI has changed. The UI Automation event model is independent of the WinEvent framework in Windows, although the Windows Automation API can make UI Automation events interoperable with the Microsoft Active Accessibility framework.
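The register-then-notify flow for a property change event can be sketched as follows. This is a hypothetical Python model, not the real event API: `EventRouter`, its two methods, and the element identifier `rememberMe` are invented for illustration. The scenario is a client watching the ToggleState property of a check box.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, str, object], None]


class EventRouter:
    """Hypothetical stand-in for the plumbing between provider and client."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[Tuple[str, str], List[Handler]] = defaultdict(list)

    def add_property_changed_handler(
        self, element_id: str, prop: str, handler: Handler
    ) -> None:
        """Called by the client to subscribe to one property of one element."""
        self._handlers[(element_id, prop)].append(handler)

    def raise_property_changed(self, element_id: str, prop: str, value: object) -> int:
        """Called by the provider; returns how many handlers were notified."""
        handlers = self._handlers.get((element_id, prop), [])
        for handler in handlers:
            handler(element_id, prop, value)
        return len(handlers)


router = EventRouter()
seen: List[Tuple[str, str, object]] = []

# The client registers interest in the check box's ToggleState only.
router.add_property_changed_handler(
    "rememberMe", "ToggleState", lambda e, p, v: seen.append((e, p, v))
)

# The provider raises events as the UI changes.
router.raise_property_changed("rememberMe", "ToggleState", "On")
router.raise_property_changed("rememberMe", "Name", "Remember me")  # nobody listening

print(seen)
```

In this model a client hears only about the properties it subscribed to, so an unrelated change such as the Name update above never reaches it.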
IAccessibleEx Interface
The IAccessibleEx interface helps existing applications or UI libraries extend their Microsoft Active Accessibility object model to support UI Automation without rewriting everything from scratch. With IAccessibleEx, you can implement only the differences between Microsoft Active Accessibility and UI Automation object models.
Because the MSAA-to-UIA Proxy translates the object models of IAccessibleEx-enabled Microsoft Active Accessibility servers as UI Automation object models, UI Automation clients don’t have to do any extra work. The IAccessibleEx interface can enable classic in-process Microsoft Active Accessibility clients to interact directly with UI Automation providers, too.
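The "implement only the differences" idea can be pictured with a toy model. The Python below is a conceptual sketch and not the real interfaces: `LegacyServer`, `Extension`, `proxy_get`, and the role mapping table are all hypothetical, and the lookup order (extension first, legacy data as fallback) is an assumption made for illustration.

```python
from typing import Dict, Optional


class LegacyServer:
    """Hypothetical MSAA-style object with a small, fixed property set."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role


class Extension:
    """Hypothetical IAccessibleEx-style add-on.

    It supplies only what the legacy object cannot express and returns
    None for everything else.
    """

    def __init__(self, extra: Dict[str, object]) -> None:
        self._extra = extra

    def get_property_value(self, prop: str) -> Optional[object]:
        return self._extra.get(prop)


# Assumed role-to-control-type mapping, for this sketch only.
ROLE_TO_CONTROL_TYPE = {"push button": "Button", "slider": "Slider"}


def proxy_get(server: LegacyServer, ext: Optional[Extension], prop: str) -> Optional[object]:
    """Model of the proxy: prefer the extension, fall back to legacy data."""
    if ext is not None:
        value = ext.get_property_value(prop)
        if value is not None:
            return value
    if prop == "Name":
        return server.name
    if prop == "ControlType":
        return ROLE_TO_CONTROL_TYPE.get(server.role)
    return None


slider = LegacyServer("Volume", "slider")
ext = Extension({"AutomationId": "volumeSlider", "RangeValue.Maximum": 100})

for prop in ("Name", "ControlType", "AutomationId", "RangeValue.Maximum"):
    print(prop, "=", proxy_get(slider, ext, prop))
```

The legacy object is left untouched. The extension carries just the two properties the old model had no place for, and the client sees one merged set of answers.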
Which to Support: Microsoft Active Accessibility, UI Automation, or IAccessibleEx?
While IAccessibleEx can be a cost-effective way of supporting UI Automation, you should weigh a couple of technical considerations before deciding.
If you are developing a new application or control, I recommend UI Automation as it provides the best flexibility. Microsoft Active Accessibility may look simpler in the short term, but there are a lot of serious limitations to overcome, like its Windows 95-inspired object model and the inability to support new UI behaviors or merge roles.
These shortcomings surface quickly when you try to introduce new controls.
The UI Automation object model is based on canonical functionalities of UI features. Control developers choose from abstracted features (control patterns) that best suit their controls’ behaviors. This is supplemented with control types that offer role information to users.
When to Use IAccessibleEx
Consider the following requirements before you use the IAccessibleEx interface for your application.
Rule #1: The baseline Microsoft Active Accessibility Server’s accessible object hierarchy must be clean.
IAccessibleEx cannot fix problems with existing accessible object hierarchies.
Rule #2: Your IAccessibleEx implementation must be compliant with both Microsoft Active Accessibility and UI Automation specifications.
Tools are available to validate compliance with both specifications.
If either of these baseline requirements is not met, you should consider implementing UI Automation natively. You can keep legacy Microsoft Active Accessibility server implementations for backward compatibility if necessary. From a UI Automation client’s perspective, there is no difference between UI Automation providers and Microsoft Active Accessibility servers that implement IAccessibleEx correctly.
Clients’ Golden Question: Microsoft Active Accessibility, UI Automation, or Something Else?
Microsoft Active Accessibility is a "chatty" architecture and is slow for clients that run out of process. To mitigate this, many accessibility tool programs chose to hook into and run in the target application process. While assistive technologies widely employ this in-process code practice, the risk and complexity are extremely high. Thus, you need an out-of-process solution with better performance and reliability.
UI Automation offers much better performance for out-of-process clients (400% faster or more in some scenarios than Microsoft Active Accessibility running out of process), while adding richness and flexibility to support the latest user interface designs.
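Why a "chatty" design hurts out-of-process clients can be shown by counting calls. The Python below is a toy model, not a benchmark and not the real API: the `Channel` class and its two methods are hypothetical, no actual inter-process communication takes place, and the counts it prints say nothing about the performance figure quoted above. It only contrasts fetching properties one at a time with fetching them in bulk.

```python
from typing import Dict, Iterable, List


class Channel:
    """Counts simulated cross-process calls; no real IPC happens here."""

    def __init__(self, elements: List[Dict[str, object]]) -> None:
        self._elements = elements
        self.round_trips = 0

    def get_property(self, index: int, name: str) -> object:
        """Chatty style: one round trip per property per element."""
        self.round_trips += 1
        return self._elements[index][name]

    def get_cached(self, names: Iterable[str]) -> List[Dict[str, object]]:
        """Bulk style: the requested properties of every element in one call."""
        self.round_trips += 1
        wanted = list(names)
        return [{n: e[n] for n in wanted} for e in self._elements]


elements = [
    {"Name": f"item {i}", "ControlType": "ListItem", "IsEnabled": True}
    for i in range(100)
]
WANTED = ("Name", "ControlType", "IsEnabled")

chatty = Channel(elements)
one_by_one = [
    {n: chatty.get_property(i, n) for n in WANTED} for i in range(len(elements))
]

batched = Channel(elements)
in_bulk = batched.get_cached(WANTED)

print(chatty.round_trips, batched.round_trips)  # 300 1
```

Both clients end up with the same data. When each round trip crosses a process boundary, the difference in call count is the cost that bulk fetching and caching are meant to remove.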
Conclusions and Resources
Windows Automation API 3.0 in Windows 7 features the best of Microsoft Active Accessibility and the UI Automation specification, while providing a cleaner migration path from one to the other. The object model is easier to use and more flexible, the automation elements reflect the evolution of modern user interfaces, and developers can define custom UI Automation control patterns, properties, and events.
The UI Automation specification is offered as a cross-platform accessibility API under the Microsoft Open Specification Promise. For more information about the specification and about how Microsoft implements it, visit the MSDN Accessibility Developer Center (http://msdn.microsoft.com/accessibility/).
Table 1: Views of the UI Automation tree.
Automation Tree View | Description |
---|---|
Raw View | The full tree of automation element objects for which the desktop is the root. |
Control View | A subset of the raw view that closely maps to the UI structure as the user perceives it. |
Content View | A subset of the control view that contains content most relevant to the user, like the values in a drop-down combo box. |
Table 2: Examples of UI Automation Element properties.
Property | Description |
---|---|
AutomationId | A string containing the UI Automation identifier (ID) for the automation element. The AutomationId property of an element is expected to be the same in any instance of the application, regardless of the local language. |
BoundingRectangle | The coordinates of the rectangle that completely encloses the element. The returned rectangle is expressed in physical screen coordinates. |
Name | A string for the text representation of the element. This string should always be consistent with the label text on screen. For example, the Name property must be “Browse…” for the button labeled “Browse…”. |
ControlType | A ControlType of the automation element, which defines characteristics of the UI element by well known UI control primitives such as button or check box. |
FrameworkId | A string for the name of the underlying UI framework. FrameworkId enables client applications to apply special cases to a particular UI framework. Examples of property values are "Win32", "WinForm", and "DirectUI". |
Table 3: Examples of UI Automation Control Patterns.
Control Pattern | Description |
---|---|
Dock | Used for controls that can be docked in a container, like toolbars or tool palettes. |
ExpandCollapse | Used for controls that can be expanded or collapsed, like the File menu. |
Grid | Used for controls that support grid functionality such as sizing and moving to a specified cell, without header information. The “large icon view” in Windows® Explorer is an example of a control that follows the Grid control pattern. |
GridItem | Used for controls within grids. For example, each cell in Explorer’s “details view” could follow the GridItem pattern. |
Invoke | Used for controls that can be invoked, such as a button. |
ItemContainer | Used for controls that host a number of children that may be virtualized. |
MultipleView | Used for controls that can switch between multiple representations of the same set of information, data, or children. For example, a list view control where data is available in thumbnail, tile, icon, list, or detail views. |
RangeValue | Used for controls that have a range of values. For example, a spinner control containing years might have a range of 1900 to 2010, while another spinner control presenting months would have a range of 1 to 12. |
Scroll | Used for controls that can scroll. |
Selection | Used for selection container controls. For example, list boxes and combo boxes. |
Table | Used for controls that have a grid as well as header information. For example, Microsoft Excel worksheets. |
Text | Used for edit controls and documents that expose textual information. |
Toggle | Used for controls where the state can be toggled. For example, check boxes and checkable menu items. |
Transform | Used for controls that can be resized, moved, and rotated. Typical uses for the Transform control pattern are in designers, forms, graphical editors, and drawing applications. |
VirtualizedItem | Used for a virtualized object in a container that supports the ItemContainer pattern. |
Window | Used for controls that provide fundamental window-based functionality within a traditional graphical user interface. |
Table 4: Types of UI Automation Events.
Event | Description |
---|---|
Property change | Raised when a property on a UI Automation element or control pattern changes. For example, if a client needs to monitor an application's check box control, it can register to listen for a property change event on the ToggleState property. When the check box control is checked or unchecked, the provider raises the event and the client can act as necessary. |
Element action | Raised when a change in the UI results from end user or programmatic activity. For example, clients can listen for the Invoked event when a user clicks a button or when the InvokePattern is invoked programmatically. |
Structure change | Raised when the structure of the UI Automation tree changes. The structure changes when UI items become visible, become hidden, or are removed from the desktop. |
General event | Raised when actions of global interest to the client occur, such as when the focus shifts from one element to another or when a window closes. |