http://social.msdn.microsoft.com/Forums/en-US/vststest/thread/df043823-ffcf-46a4-9e47-1c4b8854ca13
Troubleshooting Guide for Visual Studio Test Controller and Agent
This guide helps troubleshoot connection issues between Visual Studio Test Controller and Agent, as well as remote test execution issues. It gives an overview of the main connection points used by Test Controller and Agent and walks through general troubleshooting steps. It ends with a list of common errors we have seen and ways to fix them, a description of tools that can be useful for troubleshooting, and instructions for obtaining diagnostics information for test execution components.
We would like to use this guide as a running document; please reply to this post to add your comments.
1. Who should read this
You should read this guide if:
- You experience a problem configuring remote Test Agent/Controller, such as:
- Remote test run fails for an unclear reason. This applies to both remote execution and remote collection.
- Test Agent cannot connect to Test Controller.
- You want to get diagnostics information to report an issue in Agent/Controller to Microsoft.
- You want to understand what can potentially break Test Agent/Controller.
2. Remote Test Execution: connection points
The following diagram illustrates main connection points between Test Controller, Agent and Client. It outlines which ports are used for incoming and outgoing connections as well as security restrictions used on these ports.
The remote test execution components communicate using .NET Remoting over TCP ports. For incoming connections, by default, Test Controller uses TCP port 6901 and Test Agent uses port 6910. The Client also needs to accept incoming connections in order to get test results from Controller and, by default, uses a random port for that. For information on how to configure incoming ports, refer to the Tools section in the Appendix. Outgoing connections use random TCP ports. For all incoming connections, Test Controller authenticates the calling party and checks that it belongs to a specific security group.
All connectivity issues can be divided into 2 main groups: network issues and security/permission issues.
2.1. Network/Firewall issues (mainly implied by .Net Remoting technology):
- Controller :
- Listens on TCP port 6901 (can be configured to use a different port).
- Needs to be able to make outgoing connection to Agents and to the Client.
- Needs incoming “File and Printer sharing” connection open.
- Agent:
- Listens on TCP port 6910 (can be configured to use a different port).
- Needs to be able to make outgoing connection to Controller.
- Client:
- Needs to be able to accept incoming calls. Usually you would get a Firewall notification when Controller tries to connect to Client for the first time. On Windows 2008 Server the notifications are disabled by default and you would need to manually add a Firewall exception for the Client program (devenv.exe, mstest.exe, mlm.exe) so that it can accept incoming connections.
- By default, a random TCP port is used for incoming connections. If needed, the incoming port can be configured (see the Tools section in the Appendix).
- Needs to be able to make outgoing connection to Controller.
2.2. Permissions
Test Controller operates in one of two scenarios, and the permissions it uses differ between them:
- Test Controller runs as standalone: physical environments (VS2008 or VS2010).
- Test Controller is connected to TFS server: virtual environments (VS2010 only).
2.2.1. Permissions: Test Controller not connected to TFS server:
- To run tests remotely, Client user must belong to either TeamTestControllerUsers, or TeamTestControllerAdmins, or Administrators local group on Controller machine.
- To manage Controller/Agent, Client user must belong to TeamTestControllerAdmins or Administrators local group on Controller machine.
- Agent service account must belong to either TeamTestAgentService or Administrators local group on Controller machine.
- Controller service account must belong to either TeamTestControllerUsers or Administrators local group on Controller machine.
- Service accounts with empty/no passwords are not supported.
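The membership rules above can be written down as a small lookup, which may help when checking a setup by hand. This is only a sketch: the group names come from the list above, while the role labels, the ALLOWED_GROUPS table and the is_permitted function are illustrative names, not part of any Visual Studio API. It does not query Windows; you supply the local groups the account belongs to on the Controller machine.

```python
# Sketch of the standalone-Controller permission rules listed above.
# Role labels and function names are illustrative assumptions.
ALLOWED_GROUPS = {
    "run_tests": {"TeamTestControllerUsers", "TeamTestControllerAdmins", "Administrators"},
    "manage": {"TeamTestControllerAdmins", "Administrators"},
    "agent_service": {"TeamTestAgentService", "Administrators"},
    "controller_service": {"TeamTestControllerUsers", "Administrators"},
}

SERVICE_ROLES = {"agent_service", "controller_service"}


def is_permitted(role, groups_on_controller, has_password=True):
    """Return True if an account with these local groups satisfies the rule for role."""
    if role not in ALLOWED_GROUPS:
        raise ValueError("unknown role: %r" % (role,))
    # Service accounts with empty/no passwords are not supported.
    if role in SERVICE_ROLES and not has_password:
        return False
    return bool(ALLOWED_GROUPS[role] & set(groups_on_controller))
```

For example, is_permitted("manage", ["TeamTestControllerUsers"]) is False, because managing Controller/Agent requires TeamTestControllerAdmins or Administrators.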
2.2.2. Permissions: Test Controller connected to TFS server:
Coming soon.
2.3. Connection Points: Summary
Reviewing the connection points gives a high-level picture of what can fail in Test Controller/Agent connectivity. At this point you may already have a clear idea of which requirement is not met in your specific scenario. The next section provides step-by-step troubleshooting.
3. Step-by-step troubleshooting
Let’s walk through general troubleshooting procedure for Test Controller/Agent connection issues. For simplicity we’ll do that in step-by-step manner.
Before following these steps you may take a look at Known Issues section in the Appendix to see if your issue is one of known common issues.
The troubleshooting is based on the key connection points and in essence involves making sure that:
- The services are up and running.
- Permissions are set up correctly.
- Network connectivity works and the Firewall is not blocking connections.
Test Controller operates in one of two scenarios, and the troubleshooting steps differ between them, so we will consider each scenario separately:
- Test Controller runs as standalone: physical environments (VS2008 or VS2010).
- Test Controller is connected to TFS server: virtual environments (VS2010 only).
3.1. Step-by-step troubleshooting: VS2008 or VS2010 physical environments
Pre-requisites. Make sure you have necessary permissions.
- Depending on what you need to troubleshoot, you may need Administrator permissions on Agent and/or Controller machines.
Step 1. Make sure that the Controller is up and running and Client can connect to Controller.
- Use Visual Studio or Microsoft Test Manager (see Tools section in the Appendix) to view Controller status.
- If you can’t connect to Controller, make sure that Controller service is running:
- On Controller machine (you can also do that remotely) re/start controller service (see Tools section in Appendix).
- (if you still can’t connect) On Controller machine make sure that it can accept incoming connections through Firewall
- Open port 6901 (or create exception for the service program/executable).
- Add Firewall Exception for File and Printer Sharing.
- (if you still can’t connect) make sure that the user you run the Client under has permissions to connect to Controller:
- On Controller machine, add Client user to the TeamTestControllerAdmins local group.
- (if you still can’t connect) On Client machine make sure that Firewall is not blocking incoming and outgoing connections:
- Make sure that there is Firewall exception for Client program (devenv.exe, mstest.exe, mlm.exe) so that it can accept incoming connections.
- Make sure that Firewall is not blocking outgoing connections.
- (if you still can’t connect)
- VS2010 only: the simplest option at this point is to re-configure the Controller:
- On Controller machine log on as local Administrator, run the Test Controller Configuration Tool (see Tools section in the Appendix) and re-configure the Controller.
- All steps should be successful.
- (if you still can’t connect) Restart Controller service (see the Service Management commands in the Tools section in the Appendix).
Step 2. Make sure that there is at least one Agent registered on Controller.
- Use Visual Studio (Manage Test Controllers dialog) or Microsoft Test Manager (see Tools section in the Appendix) to view connected Agents.
- If there are no Agents on the Controller, connect the Agent(s).
- VS2010 only:
- On Agent machine log in as user that belongs to TeamTestAgentServiceAdmins.
- On Agent machine open command line and run the Test Agent Configuration Tool (see Tools section in the Appendix).
- Check ‘Register with Test Controller’, type controller machine name and click on ‘Apply Settings’.
- VS2008 only:
- In Visual Studio (Manage Test Controllers dialog) click on Add Agent.
- You may need to restart the Agent service.
Step 3. Make sure that Agent is running and Ready (for each Agent)
Agent status can be one of: Ready/Offline (temporarily excluded from Test Rig)/Not Responding/Running Tests.
- Use Visual Studio or Microsoft Test Manager (see Tools section in the Appendix) to check Agent status.
- If one of the Agents is not shown as Ready, make sure that Agent service is running:
- On Agent machine (you can also do that remotely) re/start Agent service (see Tools section in the Appendix).
- (if Agent is still not Ready)
- VS2010 only: the simplest at this time is to re-configure the Agent:
- On Agent machine log on as local Administrator and run the Test Agent Configuration Tool (see Tools section in the Appendix) and re-configure the Agent.
- All steps should be successful.
- (if Agent is still not Ready)
- If Agent is shown as Offline, select it and click on the Online button.
- On Agent machine make sure that Agent service can accept incoming connections on port 6910 (if Firewall is on, there must be a Firewall exception either for the port or for the service program/executable).
- Make sure that Agent service account belongs to the TeamTestAgentService on the Controller.
- On Controller machine use Computer Management->Local Groups to add Agent user to the TeamTestAgentService group.
- Restart services: Stop Agent service/Stop Controller service/Start Controller service/Start Agent service.
- Make sure that Agent machine can reach Controller machine (use ping).
- Restart Agent service (see the Service Management commands in the Tools section in the Appendix).
Step 4. If all above did not help, it is time now to analyze diagnostics information.
- (VS2010 only) Agent/Controller services by default log errors into Application Event Log (see Tools section in the Appendix).
- Check for suspicious log entries there.
- Enable tracing: see Diagnostics information under the Tools section in the Appendix.
- Get trace for the components involved in your scenario, some/all of:
- Controller
- Agent
- Client
- Test Agent/Controller Configuration Tool
- Make sure that Controller/Agent service accounts have write access to trace files.
- Check for entries starting with “[E”.
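Scanning a large trace file for error entries by eye is tedious, so a small filter can help. The following is a minimal sketch that reports lines beginning with the “[E” marker mentioned above. The function names are illustrative, and the file encoding is an assumption (adjust it if your trace files are written differently).

```python
def find_error_entries(lines, prefix="[E"):
    """Return (line_number, text) pairs for trace entries that start with prefix."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()  # ignore surrounding whitespace and the newline
        if text.startswith(prefix):
            hits.append((number, text))
    return hits


def scan_trace_file(path, encoding="utf-8"):
    """Scan one trace file; the encoding is an assumption, not documented behavior."""
    with open(path, encoding=encoding, errors="replace") as handle:
        return find_error_entries(handle)
```

For example, scan_trace_file(r"C:\EqtTrace.log") would list the error entries of a Client trace written to that path.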
Step 5. Take a look at Known Issues section in the Appendix to see if your issue is similar to one of those.
Step 6. Collect appropriate diagnostics information and send to Microsoft (create Team Test Forum post or Microsoft Connect bug).
3.2. Troubleshooting Guide: TFS/virtual environments (VS2010 only)
Coming soon.
3.3. Step-by-Step Troubleshooting: Summary
This concludes the troubleshooting steps. If this guide was not helpful in resolving your issue, let us know.
4. References
The following is a list of useful information sources related to Test Agent/Controller troubleshooting.
- Troubleshooting Test Execution in MSDN.
- Troubleshooting Controllers, Agents and Rigs (VS2008) in MSDN.
- Installing and Configuring Visual Studio Agents (VS2010) in MSDN.
- Understanding Visual Studio Load Agent Controller (Load Test team blog).
- Troubleshooting errors in lab management (Team Lab blog).
- Visual Studio Team System – TestForum.
- Microsoft Connect – report bugs/suggestions.
Appendix 1. Tools
The following tools can be useful for remote execution/Agent/Controller troubleshooting:
- Visual Studio: Premium (VS2010 only), Team Test Edition (VS2008 only).
- Manage Test Controllers dialog (Main menu->Test->Manage Test Controllers): see status of Controller and all connected Agents, add/remove Agents to Controller, restart Agents/the whole test rig, bring Agents online/offline, configure Agent properties.
- Note: on VS2008 this dialog is called Administer Test Controllers.
- Run tests remotely:
- VS2008: update Test Run Configuration to enable remote execution (Main Menu->Test->Edit Test Run Configurations->(select run config)->Controller and Agent->Remote->provide Test Controller name), then run a test.
- VS2010: update Test Settings to use remote execution role (Main Menu->Test->Edit Test Settings -> (select test settings)->Roles->Remote Execution), then run a test.
- Microsoft Test Manager (VS2010 only)
- Lab Center->Controllers: see status of Controller and all connected Agents, add/remove Agents to Controller, restart Agents/the whole test rig, bring Agents online/offline, configure Agent properties. Note that Lab Center only shows controllers that are associated with this instance of TFS.
- Test Controller Configuration Tool (TestControllerConfigUI.exe, VS2010 only):
- It is run as last step of Test Controller setup.
- You can use it any time after setup to re-configure Controller. The tool has embedded diagnostics which makes it easier to detect issues.
- The tool is located by default in C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE.
- Test Agent Configuration Tool (TestAgentConfigUI.exe, VS2010 only):
- It is run as last step of Test Agent setup.
- You can use it any time after setup to re-configure Agent. The tool has embedded diagnostics which makes it easier to detect issues.
- The tool is located by default in C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE.
- Diagnostics information
- Both Agent and Controller can be configured to trace diagnostics information (from errors to verbose) to Application Event Log or trace file. Clients can also be configured to trace (from errors to verbose) to trace file.
- Tracing can be enabled via .config file or registry (VS2010 only). If both are set, the registry setting takes precedence. Choose the method that is more convenient for your scenario.
- Enable tracing via .config file(s):
- One of the advantages of using config files is that you can enable tracing for each component independently and using trace settings specific only to this component.
- For Controller Service/Agent Service/Agent Process, you need the following sections in the corresponding .config file (qtcontroller.exe.config, qtagentservice.exe.config, qtagent.exe.config, qtagent32.exe.config which by default are located in C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE):
- Inside the <appSettings> section:
<add key="CreateTraceListener" value="yes"/>
- Inside the <configuration> section (note: “Verbose” is equivalent to “4”):
<system.diagnostics>
  <switches>
    <add name="EqtTraceLevel" value="Verbose" />
  </switches>
</system.diagnostics>
- Trace files (by default these are created in the same directory as where controller/agent service/process is located, C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE):
- Controller: vsttcontroller.log
- Agent Service: vsttagent.log
- Agent Process: VSTTAgentProcess.log
- Important: please make sure that the location is writable by controller/agent service/process.
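If you need to enable tracing on several machines, editing each .config file by hand gets repetitive. The sketch below applies the two settings described above (the CreateTraceListener key and the EqtTraceLevel switch) to the text of a config file. The function names are illustrative, and note one limitation of this approach: Python's ElementTree drops XML comments and the XML declaration when it rewrites a document, so keep a backup of the original file.

```python
import xml.etree.ElementTree as ET


def _get_or_create(parent, tag):
    """Return the first direct child with this tag, creating it if missing."""
    for child in parent:
        if child.tag == tag:
            return child
    return ET.SubElement(parent, tag)


def _set_add(parent, id_attr, id_value, value):
    """Update the matching <add> element, or append a new one."""
    for element in parent:
        if element.tag == "add" and element.get(id_attr) == id_value:
            element.set("value", value)
            return
    ET.SubElement(parent, "add", {id_attr: id_value, "value": value})


def enable_tracing(config_xml, level="Verbose"):
    """Return config_xml with the tracing settings from this guide applied."""
    root = ET.fromstring(config_xml)
    if root.tag != "configuration":
        raise ValueError("expected a <configuration> root element")
    app_settings = _get_or_create(root, "appSettings")
    _set_add(app_settings, "key", "CreateTraceListener", "yes")
    diagnostics = _get_or_create(root, "system.diagnostics")
    switches = _get_or_create(diagnostics, "switches")
    _set_add(switches, "name", "EqtTraceLevel", level)
    return ET.tostring(root, encoding="unicode")
```

A typical use would be to read qtcontroller.exe.config as text, pass it through enable_tracing, write the result back, and then restart the service.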
- For Client, add the following section to appropriate .config file (devenv.exe.config, mstest.exe.config, mlm.exe.config):
- Inside the <configuration> section (note: “Verbose” is equivalent to “4”):
<system.diagnostics>
  <trace autoflush="true" indentsize="4">
    <listeners>
      <add name="EqtListener" type="System.Diagnostics.TextWriterTraceListener" initializeData="C:\EqtTrace.log" />
    </listeners>
  </trace>
  <switches>
    <add name="EqtTraceLevel" value="Verbose" />
  </switches>
</system.diagnostics>
- Trace file: trace will go to the file specified by the initializeData attribute.
- Important: please make sure that the location is writable for the user you run devenv/mstest under.
- Enable tracing via registry (VS2010 only):
- One of the advantages of using the registry is that you can enable tracing for all components with just one setting, without modifying multiple configuration files.
- Create a file with the following content, rename it so that it has .reg extension and double click on it in Windows Explorer:
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\10.0\EnterpriseTools\QualityTools\Diagnostics]
"EnableTracing"=dword:00000001
"TraceLevel"=dword:00000004
"LogsDirectory"="C:\\"
- Notes:
- In case of Test Controller/Agent services the HKEY_CURRENT_USER is the registry of the user the services are running under.
- TraceLevel: 0/1/2/3/4 = Off/Error/Warning/Info/Verbose.
- LogsDirectory is optional. If that is not specified, %TEMP% will be used.
- Trace file name is <Process name>.EqtTrace.log, e.g. devenv.EqtTrace.log.
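The registry file described above can also be generated, which avoids typos in the key path and in the level value. This is a minimal sketch: the key path, value names and level numbers are taken from this guide, while the function name is illustrative. It only builds the text of the .reg file; saving and importing it is left to you. Backslashes in string values are doubled, as the .reg format requires.

```python
# Level names and numbers as listed in the notes above.
TRACE_LEVELS = {"Off": 0, "Error": 1, "Warning": 2, "Info": 3, "Verbose": 4}

DIAGNOSTICS_KEY = (
    r"HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\10.0"
    r"\EnterpriseTools\QualityTools\Diagnostics"
)


def build_tracing_reg(level="Verbose", logs_directory=None):
    """Return the text of a .reg file that enables tracing at the given level."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + DIAGNOSTICS_KEY + "]",
        '"EnableTracing"=dword:00000001',
        '"TraceLevel"=dword:%08x' % TRACE_LEVELS[level],
    ]
    if logs_directory is not None:
        # LogsDirectory is optional; escape backslashes and quotes for .reg syntax.
        escaped = logs_directory.replace("\\", "\\\\").replace('"', '\\"')
        lines.append('"LogsDirectory"="%s"' % escaped)
    return "\r\n".join(lines) + "\r\n"
```

Remember that for the Test Controller/Agent services the file has to be imported under the account the services run as, since the key lives in HKEY_CURRENT_USER.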
- Tracing from Test Controller Configuration Tool and Test Agent Configuration Tool:
- To get the trace file, click Apply, then in the “Configuration Summary” window click the view log hyperlink at the bottom.
- SysInternals’ DebugView can also be used to catch diagnostics information.
- Application configuration files
- Controller, Agent and Client use settings from application configuration files:
- Controller service: qtcontroller.exe.config
- Agent service: qtagentservice.exe.config
- Agent process: qtagent.exe.config (neutral/64bit agent), qtagent32.exe.config (32bit agent).
- VS: Devenv.exe.config.
- Command line test runner: mstest.exe.config.
- By default these files are located in C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE.
- How to configure listening ports:
- This may be useful in the following scenarios:
- Default ports used by Controller/Agent/Client can be used by some other software.
- There is firewall between controller and client. In this case you would need to know which port to enable in the firewall so that Controller can send results to the Client.
- Controller Service: qtcontroller.exe.config:
<appSettings><add key="ControllerServicePort" value="6901"/></appSettings>
- Agent Service: qtagentservice.exe.config:
<appSettings><add key="AgentServicePort" value="6910"/></appSettings>
- Client: add the following registry values (DWORD). The Client will use one of the ports from this range for receiving data from Controller:
HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\VisualStudio\10.0\EnterpriseTools\QualityTools\ListenPortRange\PortRangeStart
HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\VisualStudio\10.0\EnterpriseTools\QualityTools\ListenPortRange\PortRangeEnd
- Service Management commands
- UI: Start->Computer->Right-click->Manage-> Services and Applications->Services
- Visual Studio Test Controller
- Visual Studio Test Agent
- Command line: net start/net stop: use to start/stop Agent/Controller
- net start vsttcontroller
- net start vsttagent
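The restart order recommended in Step 3 (stop Agent, stop Controller, start Controller, start Agent) can be scripted with the same net commands. The sketch below assumes both services run on the machine where the script is executed and that it runs with Administrator rights; if Agent and Controller are on different machines, run the matching commands on each machine in the same order. The function names are illustrative. Exit codes are collected rather than treated as fatal, since net stop may return a non-zero code for a service that is already stopped.

```python
import subprocess

# Service names as used by the net start commands above.
CONTROLLER_SERVICE = "vsttcontroller"
AGENT_SERVICE = "vsttagent"


def restart_commands():
    """Return the net commands in the order: stop Agent, stop Controller, start Controller, start Agent."""
    return [
        ["net", "stop", AGENT_SERVICE],
        ["net", "stop", CONTROLLER_SERVICE],
        ["net", "start", CONTROLLER_SERVICE],
        ["net", "start", AGENT_SERVICE],
    ]


def run_commands(commands, runner=subprocess.call):
    """Run every command in order and return a list of (command, exit_code) pairs."""
    results = []
    for command in commands:
        results.append((command, runner(command)))
    return results
```

Calling run_commands(restart_commands()) on a Windows machine performs the restart; inspect the returned exit codes to see which steps succeeded.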
- Windows Firewall
- Start->Control Panel->Windows Firewall.
- IP Security Policy
- Start->Run->rsop.msc (on both Agent and Controller machines)
- Go to Computer configuration->windows settings->security settings->ip security policies
- Check if there are any policies that may prevent connections. By default there are no policies at all.
- Computer Management
- Local Groups
- Start->Computer->Manage->Local Users and Groups->Groups.
- Event Log (Application)
- Start->Computer->Manage->Event Viewer->Windows Logs->Application.
- Ping
- You can use ping to make sure that general TCP/IP network connectivity works.
- Telnet
- You can use telnet to check that you can connect to Agent/Controller, i.e. Firewall is not blocking, etc.
- telnet <ControllerMachineName> 6901
- telnet <AgentMachineName> 6910
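On machines where the telnet client is not installed, the same reachability check can be done with a few lines of script. This is a minimal sketch that only tests whether a TCP connection can be opened; like the telnet check, it says nothing about authentication or permissions. The function name is illustrative, and the machine names in the usage note are placeholders.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, can_connect("ControllerMachineName", 6901) checks the default Controller port and can_connect("AgentMachineName", 6910) checks the default Agent port. A False result usually points to a stopped service or a Firewall blocking the port.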
Appendix 2. Known issues
The following is a list of known issues and suggested resolutions for them.
2.1. The message or signature supplied for verification has been altered (KB968389)
Symptom: Agent cannot connect to Controller.
Affected scenarios: Windows XP/Windows 7 connecting to Windows 2003 Server.
Additional information:
- Event Log (Agent): The message or signature supplied for verification has been altered.
- Trace file (Agent) contains:
I, <process id>, <thread id>, <date>, <time>, <machine name>\QTAgentService.exe, AgentService: The message or signature supplied for verification has been altered.
I, <process id>, <thread id>, <date>, <time>, <machine name>\QTAgentService.exe, AgentService: Failed to connect to controller.
Microsoft.VisualStudio.TestTools.Exceptions.EqtException: The agent can connect to the controller but the controller cannot connect to the agent because of following reason: An error occurred while processing the request on the server: System.IO.IOException: The write operation failed, see inner exception. ---> System.ComponentModel.Win32Exception: The message or signature supplied for verification has been altered
   at System.Net.NTAuthentication.DecryptNtlm(Byte[] payload, Int32 offset, Int32 count, Int32& newOffset, UInt32 expectedSeqNumber)
   at System.Net.NTAuthentication.Decrypt(Byte[] payload, Int32 offset, Int32 count, Int32& newOffset, UInt32 expectedSeqNumber)
   at System.Net.Security.NegoState.DecryptData(Byte[] buffer, Int32 offset, Int32 count, Int32& newOffset)
   at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
   --- End of inner exception stack trace ---
   at System.Net.Security.NegotiateStream.EndRead(IAsyncResult asyncResult)
   at System.Runtime.Remoting.Channels.SocketHandler.BeginReadMessageCallback(IAsyncResult ar)
Server stack trace:
   at Microsoft.VisualStudio.TestTools.Controller.AgentMachine.VerifyAgentConnection(Int32 timeout)
Root cause: You installed KB968389 either via Windows Update or manually.
Resolution: uninstall KB968389 from Start->Control Panel->Programs and Features->View Installed Updates.
2.2. Controller/Agent in untrusted Windows domains or one is in a workgroup and another one is in domain.
Symptom: Agent cannot connect to Controller.
Affected scenarios: Test Controller and Agent are not in the same Windows domain. They are either in untrusted domains or one of them is in a domain and another one is in a workgroup.
Additional information:
- Trace file (Agent) contains:
W, <process id>, <thread id>, <date>, <time>, <machine name>\QTController.exe, Exception pinging agent <agent name>: System.Security.Authentication.AuthenticationException: Authentication failed on the remote side (the stream might still be available for additional authentication attempts). ---> System.ComponentModel.Win32Exception: No authority could be contacted for authentication
Server stack trace:
   at System.Net.Security.NegoState.ProcessReceivedBlob(Byte[] message, LazyAsyncResult lazyResult)
   at System.Net.Security.NegotiateStream.AuthenticateAsClient(NetworkCredential credential, ChannelBinding binding, String targetName, ProtectionLevel requiredProtectionLevel, TokenImpersonationLevel allowedImpersonationLevel)
   at System.Net.Security.NegotiateStream.AuthenticateAsClient(NetworkCredential credential, String targetName, ProtectionLevel requiredProtectionLevel, TokenImpersonationLevel allowedImpersonationLevel)
   at System.Runtime.Remoting.Channels.Tcp.TcpClientTransportSink.CreateAuthenticatedStream(Stream netStream, String machinePortAndSid)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
Root cause: Due to Windows security, Agent cannot authenticate to Controller, or vice versa.
Resolution:
- The simplest is to use Workgroup authentication mode:
- Mirror user account on Controller and Agent: create a user account with same user name and password on both Controller and Agent machine.
- Use mirrored user account to run Controller and Agent services under this account.
- If you are using VS2010 RC+ version (i.e. RC or RTM but not Beta2), add the following line to the qtcontroller.exe.config file under the <appSettings> node:
<add key="AgentImpersonationEnabled" value="no"/>
- Restart Controller/Agent services (see Tools section in the Appendix).
- Make sure there is no IP Security Policy that prevents the connection (see IP Security Policy under Tools section in the Appendix).
- By default for domain machines Windows uses domain (Kerberos) authentication, but if it fails it will fall back to workgroup (NTLM) authentication. This behavior can be and often is altered by IP Security policies, for instance, there could be a policy to block connections from machines which do not belong to the domain.
- Restart or re-configure Controller and Agent.