OpsMgr 2007 R2 now supports SQL 2008 R2 SP1

October 11, 2011, 7:19 pm

OM Community,

System Center Operations Manager 2007 R2 now supports SQL Server 2008 R2 SP1. Note: We will have the Supported Configuration posted in the next few weeks to make this more official, but feel free to go ahead and install it.

Thanks!

Nishtha Soni | Program Manager

↧

KB: Configuration may not update in System Center Operations Manager 2007

February 29, 2012, 10:34 am

Here’s a new Knowledge Base article we published today. This one talks about troubleshooting an issue where configuration doesn’t update in System Center Operations Manager 2007:

=====

Symptoms

You may experience one or more of the following symptoms in a System Center Operations Manager 2007 Management Group:

Newly installed agents display as "Not Monitored" in the Operations Console, yet existing agents are monitored.
One or more monitors on one or more agents may not change state when healthy or unhealthy conditions are met.
Agents show as being in maintenance mode in the Operations Console, yet the workflows are not actually unloaded by the System Center Management service on the monitored computer.
Configuration changes, new rules or monitors, or overrides are not applied to some agents.
The Operations Manager event log on one or more agents will display event 21026, indicating that the current configuration is still valid, even though the configuration for these agents should have been updated.
The file "OpsMgrConnector.Config.xml" in the management group folder under "Health Service State"\"Connector Configuration Cache" does not update for long periods of time relative to the rest of the management group on one or more agents.

In addition, the Operations Manager event log may display one or more events with ID 29106 when the System Center Management Configuration service restarts. For example:

Log Name: Operations Manager
Source: OpsMgr Config Service
Event ID: 29106
Level: Warning
Description:
The request to synchronize state for OpsMgr Health Service identified by "da4d36df-ce22-8930-e6d4-45b783e9fdb1" failed due to the following exception "System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.

Log Name: Operations Manager
Source: OpsMgr Config Service
Event ID: 29106
Level: Warning
Description:
The request to synchronize state for OpsMgr Health Service identified by "fc1c815b-c0c4-242d-ae27-30db4ef99b54" failed due to the following exception "Microsoft.EnterpriseManagement.Common.DataItemDoesNotExistException: TypedManagedEntityId = 'ac8f3d08-ee2a-ae21-0e46-19c3da794183' is deleted.
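When many 29106 events are logged, it can help to pull the Health Service GUIDs out of the descriptions so they can be matched to agents later. The following is a minimal sketch, assuming the event descriptions have already been exported as plain strings; the function name `health_service_ids` is hypothetical and not part of any Operations Manager tooling.

```python
import re

# Pattern for a GUID as it appears in the event descriptions above.
GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
HEALTH_SERVICE_RE = re.compile(r'Health Service identified by "(' + GUID + r')"')


def health_service_ids(descriptions):
    """Return the distinct Health Service GUIDs found, in first-seen order."""
    ids = []
    for text in descriptions:
        for guid in HEALTH_SERVICE_RE.findall(text):
            if guid not in ids:
                ids.append(guid)
    return ids
```

Only the GUID that follows "Health Service identified by" is collected; other GUIDs in the description, such as a TypedManagedEntityId, are ignored.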

Collecting ETL logs against the Configuration Service at INF level might reveal lines similar to those below:

3326 [ConfigurationChangeSetProvider.UpdateQueryTimestampFromResults] [configurationchangesetprovider_cs595]( 000000000343A92F )Timestamp = 04/11/2074 08:57:09.        
3327 [DatabaseAccessor.NotifyOnChanges] [databasenotification_cs329]( 0000000002E4BD4E )Firing change notification.        
3328 [ConfigurationEngine.DatabaseHelper.OnConfigurationChange] [configurationengine_cs499]( 00000000023546E1 )IsIncremental=True, NumberOfChanges=0        
3329 [StateManager.CollectDirty] [statemanager_cs39]( 00000000035D75A8 )State=274cda45-6031-c0e2-3659-0072251f5655 is dirty        
< large number of additional GUIDS >        
3432 [StateManager.CollectDirty] [statemanager_cs39]( 00000000035D75A8 )State=6ec4fb2d-d1c1-72a8-32e6-fe26df42aba8 is dirty        
3433 [StateManager.CollectDirty] [statemanager_cs45]( 00000000035D75A8 )NumberOfDirtyStates=104
3434 [ConfigurationEngine.CommunicationHelper.NotifyDirtyStatesTask.Run] [configurationengine_cs869]Completed successfully        
3435 [DatabaseAccessor.GetPollingIntervalMillisecondsTimeSpan] [databaseaccessor_cs126]Database polling interval 0 milliseconds

Note that the timestamp in line 3326 is set to 04/11/2074. If this appears in ETL logging, use the SQL queries in the "More Information" section to confirm that the condition described in the "Cause" section exists.
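Scanning a long formatted ETL trace for future timestamps by eye is error-prone. The sketch below assumes the trace has been converted to text lines shaped like the excerpt above, and that dates are written month/day/year (an assumption; the excerpt alone does not settle the order, so `fmt` is a parameter). The function name is hypothetical.

```python
import re
from datetime import datetime

# Matches the "Timestamp = 04/11/2074 08:57:09" portion of a trace line.
TIMESTAMP_RE = re.compile(r"Timestamp = (\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})")


def future_timestamps(lines, now, fmt="%m/%d/%Y %H:%M:%S"):
    """Return every trace timestamp that is later than `now` (UTC)."""
    found = []
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), fmt)
            if stamp > now:
                found.append(stamp)
    return found
```

Pass the current UTC time as `now`. Any value returned suggests the database should be checked with the queries in the "More Information" section.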

Cause

The System Center Management Configuration service uses a timestamp to determine when new configuration data needs to be calculated for agents and management servers. If the system clock on an agent is ahead of the system clock on the RMS, discovery data from this agent will set the timestamp for one or more managed instances hosted by that agent to the current agent system clock time. The System Center Management Configuration service will delay calculating configuration updates for the instances on that agent until the system clock on the RMS catches up with the timestamp for that discovery data. If the agent system clock was significantly ahead of the RMS system time when discovery data was sent, or the agent continues to send data with a future timestamp, the management group may experience the symptoms listed above.

Setting the agent system clock to match the RMS system clock will not reset the timestamp for the existing discovery data. The issue will remain until the RMS system clock exceeds the discovery data timestamp by the grooming interval, at which point the obsolete discovery data is groomed normally.
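To make the effect concrete, the delay described above can be modeled as the gap between the discovery timestamp and the RMS clock. This is only a toy model of the behavior as described in this article, not the service's actual logic; `config_delay` is a hypothetical name.

```python
from datetime import datetime, timedelta


def config_delay(discovery_timestamp, rms_now):
    """How long configuration calculation would be held back.

    Zero when the discovery timestamp is not ahead of the RMS clock.
    """
    return max(discovery_timestamp - rms_now, timedelta(0))
```

With a discovery timestamp in 2074 and an RMS clock in 2012, the result is more than sixty years, which is why correcting the agent clock alone does not clear the condition.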

Resolution

1) The system clocks for all agents and management servers in the management group must not significantly exceed the system clock on the RMS when submitting ANY data. If any agents or management servers have system clocks more than a few minutes faster than the RMS, they should be corrected first to avoid any additional data with future timestamps being added to the database.

2) The future timestamps for the discovery data that has already been submitted must be modified in the OperationsManager database to reflect the current time.

3) The System Center Management Configuration service and System Center Management service on the RMS must be restarted after both of the above conditions are met.

More Information

1) Use the following three queries to determine if this condition exists. The queries must be run against the OperationsManager database. If the timestamp with the greatest value in the table is greater than the current time (in UTC format), then the condition exists.

Select GetUTCDate() as 'Current Time',
MAX(TimeGeneratedOfLastSnapshot) as 'DiscoverySource Timestamp' from DiscoverySource;

Select GetUTCDate() as 'Current Time',
MAX(timegenerated) as 'DiscoverySourceToTypedManagedEntity Timestamp' from DiscoverySourceToTypedManagedEntity;

Select GetUTCDate() as 'Current Time',
MAX(timegenerated) as 'DiscoverySourceToRelationship Timestamp' from DiscoverySourceToRelationship;

2) The following three queries can be used to determine which computers may have submitted discovery data with a future timestamp. If the system clocks on these agents are not current, set them to current time before taking any additional action.

-- Find all computers with DiscoverySource Timestamp more than one day in future --
Select DisplayName, *
from BaseManagedEntity
where BaseManagedEntityID in
(select BaseManagedEntityId from BaseManagedEntity BME
join DiscoverySource DS on DS.BoundManagedEntityId = BME.BaseManagedEntityId
where DS.TimeGeneratedOfLastSnapshot > DATEADD (d, 1, GETUTCDATE())
and FullName like 'Microsoft.Windows.Computer%');

-- Find all computers with DiscoverySourceToTypedManagedEntity Timestamp more than one day in future --
Select DisplayName, *
from BaseManagedEntity
where BaseManagedEntityID in
(select BaseManagedEntityId from BaseManagedEntity BME
join DiscoverySourceToTypedManagedEntity DSTME on DSTME.TypedManagedEntityId = BME.BaseManagedEntityId
where DSTME.TimeGenerated > DATEADD (d, 1, GETUTCDATE())
and FullName like 'Microsoft.Windows.Computer%');

-- Find all computers with DiscoverySourceToRelationship Timestamp more than one day in future --
Select DisplayName, *
from BaseManagedEntity
where BaseManagedEntityID in
(select BaseManagedEntityId from BaseManagedEntity BME
join DiscoverySource DS on DS.BoundManagedEntityId = BME.BaseManagedEntityId
join DiscoverySourceToRelationship DSR on DSR.DiscoverySourceId = DS.DiscoverySourceId
where DSR.TimeGenerated > DATEADD (d, 1, GETUTCDATE())
and FullName like 'Microsoft.Windows.Computer%');

3) To correct the existing data, run the following commands against the affected tables.

Update DiscoverySource
Set TimeGeneratedOfLastSnapshot = GETUTCDATE()
where TimeGeneratedOfLastSnapshot > GETUTCDATE();

Update DiscoverySourceToTypedManagedEntity
Set TimeGenerated = GETUTCDATE()
where TimeGenerated > GETUTCDATE();

Update DiscoverySourceToRelationship
Set TimeGenerated = GETUTCDATE()
where TimeGenerated > GETUTCDATE();

4) The following query can be used to see what additional data has been submitted to the database with a timestamp in the future. The tables related to maintenance mode should have several rows, assuming there are agents currently in maintenance mode which is scheduled to end at some time. All other tables should have timestamps with the current time, or in the past.

/* */       
/* The following query will search all tables in the database */        
/* for columns with datetime datatypes. It will then return */        
/* the total number of rows in each table that have values */        
/* greater than the configured number of days from present. */        
/* Times are all in UTC format. The default increment is */        
/* 3 days, but can be adjusted as needed. */        
/* */        
DECLARE @tabname AS sysname;        
DECLARE @colname AS sysname;        
DECLARE @fcontin AS tinyint;        
DECLARE @query AS nvarchar(max);
CREATE TABLE #work       
(        
TableName sysname,        
ColumnName sysname,        
NumRows int,        
);
DECLARE cur_meta CURSOR FOR       
SELECT t.Name 'Table',        
c.Name 'Column'        
FROM sys.columns c        
INNER JOIN sys.tables t ON c.object_id = t.object_id        
INNER JOIN sys.types y ON c.system_type_id = y.system_type_id        
WHERE y.Name = 'datetime';
/* Change the increment in the DATEADD(dd,3,GETUTCDATE()) function */       
/* as needed from the default of +3 days from current time */         
OPEN cur_meta;        
SET @fcontin = 1;        
WHILE (@fcontin > 0)        
BEGIN        
FETCH cur_meta INTO @tabname, @colname;        
IF (@@FETCH_STATUS < 0)        
BREAK;        
PRINT 'Table = '+ @tabname + ', Column = ' + @colname;        
SET @query = 'SELECT ''' + @tabname        
+ ''', ''' + @colname        
+ ''', COUNT(*) FROM ' + QUOTENAME(@tabname)        
+ ' WHERE ' + QUOTENAME(@colname) + ' > DATEADD(dd,3,GETUTCDATE())';        
INSERT INTO #work        
EXECUTE ( @query );        
END        
CLOSE cur_meta;        
DEALLOCATE cur_meta;
SELECT *       
FROM #work        
ORDER BY 3 DESC;        
DROP TABLE #work;

=====

For the most current version of this article please see the following:

2635742 : Configuration may not update in System Center Operations Manager 2007

J.C. Hornbeck | System Center & Security Knowledge Engineer

Get the latest System Center news on Facebook and Twitter:

The Forefront Server Protection blog: http://blogs.technet.com/b/fss/
The Forefront Endpoint Security blog : http://blogs.technet.com/b/clientsecurity/
The Forefront Identity Manager blog : http://blogs.msdn.com/b/ms-identity-support/
The Forefront TMG blog: http://blogs.technet.com/b/isablog/
The Forefront UAG blog: http://blogs.technet.com/b/edgeaccessblog/

↧

KB: Configuring the SharePoint 2010 Management Pack for System Center Operations Manager

April 16, 2012, 12:25 pm

Here’s another new Knowledge Base article we published today. This one goes through some common tips and troubleshooting for the SharePoint 2010 Management Pack for System Center Operations Manager:

=====

Summary

This article describes the steps to troubleshoot or configure the SharePoint 2010 Management Pack for the following scenarios:

No Access to SharePoint Foundation 2010 and SharePoint Server 2010 Databases
Configuring the Run As Account Association
Unable to monitor multiple farms in local domain or remote domains
How to run the Configuration Task
Unable to run the "Configure SharePoint Management Pack" task in System Center 2012 Operations Manager
Unable to monitor SharePoint 2010 Databases
User Education - Isolating Discoveries
User Education - Sync Time Overrides
User Education - Adding Workflows to SharePoint Config file

Scenario 1 - No Access to SharePoint Foundation 2010 and SharePoint Server 2010 databases
Symptom:

Computers are populated in the "Unidentified Machines" view in the Operations Manager console under Monitoring -> SharePoint 2010 Products -> Unidentified Machines

Several views in the Console under Monitoring -> SharePoint 2010 Products are "blank" or "Not Monitored". These views include:
Administration - Not monitored
Content Databases - Blank
Diagram View - Not monitored
Events - Blank
Farms - Blank
Performance - Blank
Servers - Blank
Service Front Ends - Blank
Services - Blank
Shared Services - Blank
SPHA Rules - Blank
Web Applications - Blank

Resolution: Set the proper permissions on the SharePoint Foundation and SharePoint Server 2010 databases. Enable debug tracing to determine where errors may occur.

Required Permissions
The required permissions for the configured run as account on an individual SharePoint farm are:

Local admin on all SharePoint 2010 Front End and Application Servers
Local admin on all SQL machines that host SharePoint 2010 databases
Full Farm Administrator rights within SharePoint 2010
DBO for all SharePoint databases

NOTE All SharePoint Foundation 2010 and SharePoint Server 2010 databases created during initial setup require the above permissions.
Below is a list of some of the databases in SharePoint Foundation 2010 and SharePoint Server 2010 which require DBO permissions. This is not a complete list as it depends on your specific configuration.

Application_Registry_Service
Bdc_Service_DB
Managed Metadata Service
PerformancePoint Service Application
Search_Service_CrawlStoreDB
Search_Service_DB
Search_Service_Application_PropertyStoreDB
Secure_Store_Service_DB
SharePoint_Config
SharePoint_AdminContent
StateService
User Profile Services Application_ProfileDB
User Profile Services Application_SocialDB
User Profile Services Application_SyncDB
User Profile Services Application_ReportingDB
User Profile Services Application_StagingDB
WebAnalyticsServiceApplication_ReportingDB
WebAnalyticsServiceApplication_StagingDB
WordAutomationServices
WSS_Content
WSS_Logging

NOTE The initial installation account for SharePoint Foundation 2010 and SharePoint Server 2010 already has the necessary permissions in all databases created during initial installation. It is recommended that you use this installation account to configure the SharePoint Foundation 2010 and SharePoint Server 2010 Management Packs. If security requirements call for a new account for management pack administration and discovery, you will have to duplicate the permissions already granted to the SharePoint installation account.

NOTE For a clustered installation of the root management server, the SharePointMP.config file must exist in the following directory:

For x86 versions of Windows Server operating systems: %ProgramFiles(x86)%\System Center Management Packs\Microsoft SharePoint 2010 Products OpsMgr 2007 MP en-us

For 64-bit versions of Windows Server 2008 or Windows Server 2008 R2, copy the SharePointMP.config file from the %ProgramFiles(x86)%\System Center Management Packs directory to the %ProgramFiles%\System Center Management Packs directory.

To grant a new account full farm administrator rights:

Open SharePoint 2010 Central Administration.
On the left panel click on Security.
In the middle pane right below Users click on “Manage the Farm Administrators Group”.
If the account you initially used to install SharePoint is not already there, add the SharePoint Run As Account to the group.
To add the Operations Manager SharePoint account, click the drop-down arrow next to “New” in the top left corner and choose Add Users.
Click on the small book icon (browse).
Type in the name of the Operations Manager SharePoint Action Account.
Click on the search icon and wait until it returns the Operations Manager SharePoint Action Account.
Click on the Add button.
Click OK.

Enable Debug Tracing
Enabling Debug Tracing turns on debug trace on the agent computers that run Windows PowerShell script-based discoveries and SPHA monitors. It is turned off by default. When it is enabled, the script-based discoveries and monitors write debug trace information to the Operations Manager event log channel on all agent computers, and all debug trace events have an event ID of 0.
To enable debug tracing do the following:

In the Operations Console Select Monitoring.
Select SharePoint 2010 Products.
Select Administration view.
On the Actions panel, click the task named “Set DebugTrace for SharePoint Management Pack”. A Run Task window will pop up.
To enable debug trace (the default option), click Run. To disable debug trace, click Override.
Set the Enabled parameter value to “False” in the popup dialog.
Click Override to close the dialog.
Click Run.
Wait for the task to finish in Task Status window, and then check the Task Output to ensure that the task completes successfully.
Click Close.

How to use debug tracing
Run the “Set DebugTrace For SharePoint Management Pack” task, rerun the Admin Task, and then go to the Operations Manager event channel on the server and check events with ID = 0. Note the timestamp of the event in the event log, and then check the SharePoint ULS trace log around the same time to confirm what the event reports.
For more information about the ULS trace log, see the SharePoint Foundation 2010 documentation on TechNet (http://technet.microsoft.com/en-us/sharepoint/ee263910.aspx ).

Configure the "More Secure Option"
The More Secure option will deliver the configured credentials only to the machines specified in this section. The credentials sent will be for the purpose of discovering and monitoring the SharePoint farms specified. The machines in this list should be the same machines specified in the SharePointMP.config file. The requirement is to have all distributed application components listed for each individual farm. This would include front end server and SQL servers that host the SharePoint databases or any component thereof.

To configure the More Secure Option do the following:

Option 1 - Create the Run As Account and configure

Open the Operations Manager Console.
Go to the Administration tab.
Expand the Security node.
Right-click Run As Accounts.
Select Create Run As Account and Click Next.
Set the Run As Account Type as "Windows", give it a Display Name and Click Next.
Enter in the Credentials for the Active Directory Domain User Account and Click Next.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click Create.

Option 2 - Configure an already existing account

Open the Operations Manager Console and navigate to the Administration Tab.
Expand the Run As Configuration node and highlight Accounts.
In the middle pane, locate an existing Run As Account under Type: Windows.
Right-click the account and choose Properties.
Click on the Distribution Tab.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click OK.

NOTE Distribution of Security ensures that all the servers that are part of the SharePoint farm are selected and included here. We recommend having one set of Operations Manager servers monitor only one SharePoint farm. We do not recommend multi-homed agent computers (SharePoint servers that are monitored in multiple Operations Manager management groups).

Scenario 2 - Configuring the Run As Account Association
Symptoms

Several views in the Console under Monitoring -> SharePoint 2010 Products are "blank" or "Not Monitored". These views include:
Administration - Not monitored
Content Databases - Blank
Diagram View - Not monitored
Events - Blank
Farms - Blank
Performance - Blank
Servers - Blank
Service Front Ends - Blank
Services - Blank
Shared Services - Blank
SPHA Rules - Blank
Web Applications - Blank

The following error message can be seen when the Run As Account Association is not configured properly because of a syntax error in the SharePointMP.config file.
Example:

The Event Policy for the process started at 10:44:13 PM has detected errors in the output. The 'StdErr' policy expression: 
.+ 
matched the following output: 
Account OpsMgr SharePoint Action Account doesn't exist 
Failed to find RunAs account OpsMgr SharePoint Action Account 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile 
-Noninteractive "$ep = get-executionpolicy; if ($ep -gt 'RemoteSigned') {set-executionpolicy 
remotesigned} & '"C:\Program Files\System Center Operations Manager 2007\Health Service State\
Monitoring Host Temporary Files 32\9687\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX 
Error Code: -2130771918 (Unknown error (0x80ff0032))

Machines that do not have SharePoint Foundation 2010 or SharePoint Server 2010 installed are discovered as SharePoint 2010 Servers.

Resolution - Configure the Run As Account association, configure the Machine Name association and configure the "More Secure Option".
Configure the "More Secure Option"

The More Secure option will deliver the configured credentials only to the machines specified in this section. The credentials sent will be for the purpose of discovering and monitoring the SharePoint farms specified. The machines in this list should be the same machines specified in the SharePointMP.config file. The requirement is to have all distributed application components listed for a specific farm. This would include front end servers and SQL servers that host the SharePoint databases.
To configure the More Secure Option do the following:

Option 1 - Create the Run As Account and configure

Open the Operations Manager Console.
Go to the Administration tab.
Expand the Security node.
Right-click Run As Accounts.
Select Create Run As Account and Click Next.
Set the Run As Account Type as "Windows", give it a Display Name and Click Next.
Enter in the Credentials for the Active Directory Domain User Account and Click Next.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click Create.

Option 2 - Configure an already existing account

Open the Operations Manager Console and navigate to the Administration Tab.
Expand the Run As Configuration node and highlight Accounts.
In the middle pane, locate an existing Run As Account under Type: Windows.
Right-click the account and choose Properties.
Click on the Distribution Tab.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click OK.

NOTE Distribution of Security ensures that all the servers that are part of the SharePoint farm are selected and included here. We recommend having one set of Operations Manager servers monitor only one SharePoint farm. We do not recommend multi-homed agent computers (SharePoint servers that are monitored in multiple Operations Manager management groups).

Configuring the Run As Account Association
The Run As Account needs to be associated within the SharePoint Management Pack config file. If not configured correctly you will not be able to discover the SharePoint Servers.

To configure the SharePointMP.config file:

Navigate to <drive>:\Program Files (x86)\System Center Management Packs\Microsoft SharePoint 2010 Products OpsMgr 2007 MP en-us
NOTE For 64-bit versions of Windows Server 2008 or Windows Server 2008 R2, copy the SharePointMP.config file from the %ProgramFiles(x86)%\System Center Management Packs directory to the %ProgramFiles%\System Center Management Packs directory.
Right click the SharePointMP.config file and choose edit
Locate the section as shown below
Example:
<Association Account="SharePoint Discovery/Monitoring Account" Type="Agent">
Change this section to reflect the “Display Name” of the Run As Account you previously configured for the SharePoint farm.
The section should now look like this:
<Association Account="SPAdmin" Type="Agent">
or
<Association Account="Domain\SPAdmin" Type="Agent">

NOTE Do not confuse this with the actual active directory domain user account.

Configuration of Machine Names
Configure the machine names of all the servers that are part of the SharePoint farm so that they match the "More Secure" section of the Run As Account used for the SharePoint 2010 farm.

NOTE To confirm this name run a hostname command from a command prompt on the servers either locally or remotely for each computer that is part of the farm.
To configure the SharePointMP.config file:

Navigate to <drive>:\Program Files (x86)\System Center Management Packs\Microsoft SharePoint 2010 Products OpsMgr 2007 MP en-us
NOTE For 64-bit versions of Windows Server 2008 or Windows Server 2008 R2, copy the SharePointMP.config file from the %ProgramFiles(x86)%\System Center Management Packs directory to the %ProgramFiles%\System Center Management Packs directory.
NOTE For a clustered root management server, the same procedure must be performed on both nodes of the cluster.
Right click the SharePointMP.config file and choose edit.
Find the section as shown below
Example:
<Machine Name="" />
<Machine Name="" />
</Association>
Change this section to include the SharePoint Server names for example:
<Machine Name="SRV1" />
<Machine Name="SRV2" />
</Association>

Confirm the Run As Account has been configured
To confirm the Run As Account has been configured:

Open the Operations Manager event log.
Look for event ID 7026 - open this event - this should indicate that the run as account for the SharePoint MP has successfully logged on.

NOTE: An event ID 7000 in the Operations Manager event log indicates that the run as account for the SharePoint MP has failed to log on.

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7000
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer:
Description:
The Health Service could not log on the RunAs account contoso\spadmin for management group <MGName>. The error is Logon failure: unknown user name or bad password.(1326L). This will prevent the health service from monitoring or performing actions using this RunAs account

Additionally, you may see the following events:

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7021
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer:
Description:
The Health Service was unable to validate any user accounts in management group <MGName>.

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7015
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer:
Description:
The Health Service cannot verify the future validity of the RunAs account contoso\spadmin for management group <MGName>. The error is Logon failure: unknown user name or bad password.(1326L).
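The event IDs above can be folded into a quick pass/fail check when reviewing an exported event list. This is a minimal sketch based only on the meanings given in this article (7026 for a successful logon; 7000, 7015 and 7021 for failures); `runas_status` is a hypothetical helper, and collecting the event IDs from the log is left to whatever export tool is in use.

```python
SUCCESS_ID = 7026                 # Run As account logged on successfully
FAILURE_IDS = {7000, 7015, 7021}  # logon failed / validity unverified / no accounts validated


def runas_status(event_ids):
    """Summarize HealthService event IDs as (status, relevant_ids)."""
    seen = set(event_ids)
    failures = sorted(seen & FAILURE_IDS)
    if failures:
        return "failed", failures
    if SUCCESS_ID in seen:
        return "ok", [SUCCESS_ID]
    return "unknown", []
```

A failure ID takes precedence over 7026 here, on the assumption that a mixed result still needs investigation.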

Scenario 3 - Unable to monitor multiple farms in local domain or remote domains

Symptom:

Only one server farm is discovered as seen from the Monitoring -> SharePoint 2010 Products -> Farms state view. Servers for other farms show up in the Monitoring -> SharePoint 2010 Products -> Unidentified Machines state view

Resolution: Configure the SharePointMP.config file to discover more than one server farm.

Required Permissions
The required permissions for each individual SharePoint farm run as account are:

Local admin on all SharePoint Front End and Application servers
Local admin on all SQL boxes that host SharePoint 2010 Databases
Full Farm Administrator rights within SharePoint 2010
DBO for all SharePoint databases

For a clustered installation of the root management server the SharePointMP.config file must exist in the same directory as described above in each individual node of the cluster.

Configure the "More Secure Option"
The More Secure option will deliver the configured credentials only to the machines specified in this section.
The credentials sent will be for the purpose of discovering and monitoring the SharePoint farms specified. The machines in this list should be the same machines specified in the SharePointMP.config file. The requirement is to have all distributed application components listed for a specific farm. This would include front end servers and SQL servers that host the SharePoint databases.

To configure the More Secure Option do the following:

Option 1 - Create the Run As Account and configure

Open the Operations Manager Console.
Go to the Administration tab.
Expand the Security node.
Right-click Run As Accounts.
Select Create Run As Account and Click Next.
Set the Run As Account Type as "Windows", give it a Display Name and Click Next.
Enter in the Credentials for the Active Directory Domain User Account and Click Next.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click Create.

Option 2 - Configure an already existing account

Open the Operations Manager Console and navigate to the Administration Tab.
Expand the Run As Configuration node and highlight Accounts.
In the middle pane, locate an existing Run As Account under Type: Windows.
Right-click the account and choose Properties.
Click on the Distribution Tab.
Select "More Secure" option and add all of the servers that are part of the SharePoint farm. This will include all SharePoint Frontend, Application and SQL Servers for that SharePoint farm.
Click OK.

Example Scenario: You have 3 farms residing in 2 different domains.

Contoso - SharePoint Farm Administrator 1 is associated with the farm administrator account for the first SharePoint farm in contoso.com domain and uses the Domain Account SPADMIN1

Contoso - SharePoint Farm Administrator 2 is associated with the farm administrator account for the second SharePoint farm in contoso.com domain and uses the Domain Account SPADMIN2
Fabrikam - SharePoint 2010 Farm Administrator is associated with the farm administrator account for the third SharePoint farm in fabrikam.com domain and uses the Domain Account FKSPADMIN

NOTE For the remote domain Fabrikam.com it is assumed that you have a reliable link using an Operations Management Gateway Server or a two way full trust for the domains.

Use the display name of each Run As Account as it appears under Administration -> Run As Configuration -> Accounts -> Type: Windows.
To configure the SharePointMP.config file:

Navigate to the:
Drive:\Program Files (x86)\System Center Management Packs\Microsoft SharePoint 2010 Products OpsMgr 2007 MP en-us
NOTE For 64-bit version of Windows Server 2008 or Windows Server 2008 R2 copy the SharePointMP.config file to the %Program Files\System Center Management Packs from the %ProgramFiles(x86)%\System Center Management Packs directory
NOTE For a clustered installation of the root management server the SharePointMP.config must exist in the same directory as described above in each individual node of the cluster.
Right click the SharePointMP.config file and choose edit.

Find the "Association" and "Machine Name" section in the SharePointMP.config file as shown in the example below

</Annotation>
  <Association Account="SharePoint Discovery/Monitoring Account" Type="Agent">
    <Machine Name="" />
  </Association>

Change the "Association" and "Machine Name" section to read as follows in this example:

<Association Account="Contoso - SharePoint Farm Administrator 1" Type="Agent"> 
<Machine Name="Contoso1" /> 
<Machine Name="Contoso2" /> 
<Machine Name="Contoso3" /> 
<Machine Name="Contoso4" /> 
<Machine Name="Contoso5" /> 
<Machine Name="Contoso6" /> 
</Association>

<Association Account="Contoso - SharePoint Farm Administrator 2" Type="Agent"> 
<Machine Name="Constosrv1" /> 
<Machine Name="Constosrv2" /> 
<Machine Name="Constosrv3" /> 
</Association>

<Association Account="Fabrikam - SharePoint 2010 Farm Administrator" Type="Agent"> 
<Machine Name="Fabrikam1" /> 
<Machine Name="Fabrikam2" /> 
<Machine Name="Fabrikam3" /> 
</Association>
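Because a stray space or an empty Machine Name in these entries is easy to miss, the following is a minimal Python sketch that checks Association entries before the configuration task is run. It is not part of the management pack. The `Config` root element and the function name `check_associations` are assumptions made only so the fragment parses on its own, and the sample deliberately includes one machine name with a leading space.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the Association examples in this article.
# The <Config> root element is an assumption used only to make it parseable.
FRAGMENT = """
<Config>
  <Association Account="Contoso - SharePoint Farm Administrator 1" Type="Agent">
    <Machine Name="Contoso1" />
    <Machine Name="Contoso2" />
  </Association>
  <Association Account="Fabrikam - SharePoint 2010 Farm Administrator" Type="Agent">
    <Machine Name="Fabrikam1" />
    <Machine Name=" Fabrikam3" />
  </Association>
</Config>
"""

def check_associations(xml_text):
    """Return a list of human-readable problems found in Association entries."""
    problems = []
    root = ET.fromstring(xml_text)
    for assoc in root.iter("Association"):
        account = assoc.get("Account", "")
        if not account.strip():
            problems.append("Association with an empty Account attribute")
        machines = assoc.findall("Machine")
        if not machines:
            problems.append(f"{account!r}: no Machine entries")
        for machine in machines:
            name = machine.get("Name", "")
            if not name:
                problems.append(f"{account!r}: empty Machine Name")
            elif name != name.strip():
                problems.append(f"{account!r}: Machine Name {name!r} has stray whitespace")
    return problems

problems = check_associations(FRAGMENT)
for problem in problems:
    print(problem)
```

Run against the sample, the sketch reports only the entry with the leading space. The Account values still have to match the Run As Account display names in the console, which this check cannot verify.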

Confirm the Run As Account has been configured
To confirm the Run As Account has been configured:

Open the Operations Manager event log.
Look for event ID 7026 and open it. This event indicates that the Run As account for the SharePoint MP has successfully logged on.

NOTE An event ID 7000 in the Operations Manager event log indicates that the run as account for the SharePoint MP has failed to log on.

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7000
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer: SRV1.contoso.com
Description:
The Health Service could not log on the RunAs account contoso\spadmin for management group <MGNAME>. The error is Logon failure: unknown user name or bad password.(1326L). This will prevent the health service from monitoring or performing actions using this RunAs account

You may also see the following events:

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7021
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer: SRV1.contoso.com
Description:
The Health Service was unable to validate any user accounts in management group <MGNAME>.

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 7015
Task Category: Health Service
Level: Error
Keywords: Classic
User: N/A
Computer: SP2010SRV1.contoso.com
Description:
The Health Service cannot verify the future validity of the RunAs account contoso\spadmin for management group <MGNAME>. The error is Logon failure: unknown user name or bad password.(1326L).
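
As a quick reference, the event IDs discussed above can be summarized in a small lookup. This is a hedged Python sketch: the event IDs and their meanings come from this article, while the names `RUN_AS_EVENTS` and `summarize` are illustrative. How the event IDs are collected from the Operations Manager event log is left out.

```python
# Event IDs and meanings as described in this article.
RUN_AS_EVENTS = {
    7026: ("ok", "Run As account logged on successfully"),
    7000: ("error", "Health Service could not log on the Run As account"),
    7015: ("error", "Health Service cannot verify the future validity of the Run As account"),
    7021: ("error", "Health Service was unable to validate any user accounts"),
}

def summarize(event_ids):
    """Return 'error', 'ok', or 'unknown' for a sequence of observed event IDs.

    Simplification: any failure event wins over a success event, regardless
    of the order in which the events were logged.
    """
    statuses = {RUN_AS_EVENTS[e][0] for e in event_ids if e in RUN_AS_EVENTS}
    if "error" in statuses:
        return "error"
    if "ok" in statuses:
        return "ok"
    return "unknown"

print(summarize([7026]))
print(summarize([7026, 7000, 7021]))
```

A result of "error" means the Run As account credentials or distribution should be rechecked before the configuration task is run.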

Scenario 4 - How to Run the Configuration Task

Symptom: The configuration task fails to run and one or more of the following errors are generated

Example 1

Exception calling ".ctor" with "1" argument(s): "The user Contoso\SPAdmin does not have sufficient permission to perform the operation." 
Failed to connect to local management group 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile -Noninteractive "$ep = get-executionpolicy; if 
($ep -gt 'RemoteSigned') {set-executionpolicy remotesigned} & '"C:\Program Files\System Center Operations Manager 2007\Health Service 
State\Monitoring Host Temporary Files 49\5037\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX
Error Code: -2130771918 (Unknown error (0x80ff0032)).

Example 2

The Event Policy for the process started at 10:44:13 PM has detected errors in the output. The 'StdErr' policy expression: 
.+ 
matched the following output: 
Account OpsMgr SharePoint Action Account doesn't exist 
Failed to find RunAs account OpsMgr SharePoint Action Account 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile -Noninteractive "$ep = get-executionpolicy; if 
($ep -gt 'RemoteSigned') {set-executionpolicy remotesigned} & '"C:\Program Files\System Center Operations Manager 2007\Health Service 
State\Monitoring Host Temporary Files 32\9687\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX 
Error Code: -2130771918 (Unknown error (0x80ff0032)).

Resolution: Add the Run As Account to the Operations Manager Administrators role

To add the Run As Account that is used to execute the task to the role:

Open the Operations Console.
Navigate to Administration.
Click on Security.
Click on User Roles.
Click on Operations Manager Administrators.
Add the account running the task as part of the Operations Manager Administrators role.
NOTE For 64-bit versions of Windows Server 2008 or Windows Server 2008 R2, copy the SharePointMP.config file from the %ProgramFiles(x86)%\System Center Management Packs directory to the %ProgramFiles%\System Center Management Packs directory.
NOTE For a clustered installation of the root management server, the SharePointMP.config file must exist in the directory described above on each individual node of the cluster.

Configure SharePoint Management Pack Task

The admin task configures the management pack by ensuring the existence of an override management pack, associating 'RunAs' account(s) to servers, enabling proxy settings, and initiating discoveries.

To run the "Configure SharePoint Management Pack" task do the following

Open the Operations Manager Console.
Click on the Monitoring tab of the console.
Expand the SharePoint 2010 Products view.
Click on the Administration state view.
On the Actions Pane, under Microsoft SharePoint 2010 Farm Group Tasks, click Configure SharePoint Management Pack.
Select the appropriate task credentials (preferably the SharePoint Admin Run As Account you have previously set up).
Click Run and wait for the task to finish successfully.
Click Close.

Example of successful task

Configure SharePoint Management Pack Task Description
This admin task configures the management pack by ensuring the existence of an override management pack, associating 'RunAs' account(s) to servers, enabling proxy settings, and initiating discoveries.
Status:Success
Scheduled Time:
Start Time:
Submitted By:CONTOSO\SPADMIN
Run As:
Run Location:
Target:
Target Type:Microsoft SharePoint 2010 Farm Group
Category:Operations
Task Output:
Output
Load configuration file SharePointMP.Config
Configure Microsoft.SharePoint.Foundation.2010 version 14.0.4744.1000
Found override management pack Microsoft.SharePoint.Foundation.2010.Override version 1.0.0.0
Change 'SyncTime' configuration override to 20:06 for Microsoft.SharePoint.Foundation.2010.WSSInstallation.Discovery
Microsoft.SharePoint.Foundation.2010.WSSInstallation.Discovery does not have configuration TimeoutSeconds
Change 'SyncTime' configuration override to 20:08 for Microsoft.SharePoint.Foundation.2010.SPFarm.Discovery
Change 'SyncTime' configuration override to 20:14 for Microsoft.SharePoint.Foundation.2010.SPService.Discovery
Change 'SyncTime' configuration override to 20:20 for Microsoft.SharePoint.Foundation.2010.SPSharedService.Discovery
Change 'SyncTime' configuration override to 20:26 for Microsoft.SharePoint.Foundation.2010.SPHARule.Discovery
Change 'SyncTime' configuration override to 20:32 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.Availability
Change 'SyncTime' configuration override to 20:32 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.Security
Change 'SyncTime' configuration override to 20:32 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.Performance
Change 'SyncTime' configuration override to 20:32 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.Configuration
Change 'SyncTime' configuration override to 20:32 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.Custom
Change 'SyncTime' configuration override to 20:38 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.SPServer.Availability
Change 'SyncTime' configuration override to 20:38 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.SPServer.Security
Change 'SyncTime' configuration override to 20:38 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.SPServer.Performance
Change 'SyncTime' configuration override to 20:38 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.SPServer.Configuration
Change 'SyncTime' configuration override to 20:38 for Microsoft.SharePoint.Foundation.2010.SPHARuleMonitor.SPServer.Custom
SharePoint management pack configuration completed successfully
Error
None
Exit Code: 0

Scenario 5 - Unable to run the "Configure SharePoint Management Pack" task in System Center Operations Manager 2012

Symptom: The following error message is generated

The Event Policy for the process started at 6:51:29 PM has detected errors in the output. The 'StdErr' policy expression: 
.+ 
matched the following output: 
Exception calling "ImportManagementPack" with "1" argument(s): "This method from the System Center Operations Manager 2007 
R2 SDK is not supported to work with System Center Operations Manager 2012. Please migrate to the System Center Operations Manager 2012 SDK." 
Failed to create override management pack Microsoft.SharePoint.Foundation.2010.Override 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile -Noninteractive "$ep = get-executionpolicy; 
if ($ep -gt 'RemoteSigned') {set-executionpolicy remotesigned} & '"E:\Program Files\System Center 2012\Operations Manager\Server\Health Service 
State\Monitoring Host Temporary Files 11\7481\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX
Error Code: -2130771918 (Unknown error (0x80ff0032)).

Resolution: Contact Microsoft Support to obtain an updated version of the management pack that is compatible with the System Center Operations Manager 2012 SDK

Scenario 6 - Unable to monitor SharePoint 2010 Databases

Symptom:

Critical alerts are generated in the active alerts view under Monitoring -> SharePoint 2010 Products -> Active Alerts

SharePoint: Database Connection Failed
Source: Configuration Database
Alert Description: A critical incident has occurred where the connection to database Data Source=sp2010srv2;Initial Catalog=SharePoint_Config;Integrated Security=True;Enlist=False;Connect Timeout=15 failed.
Path: Configuration Database
Alert Monitor: SQL Database Connection Failed
Created:
Alert Context:
Date and Time
HRESULT -2147217805
Result Data Source could not be initialized
Error Message Format of the initialization string does not conform to the OLE DB specification.
Initialization Time 23
Open Time 0
Execution Time 0
Fetch Time 0
Result Set Input Data Item

The SQL Database Connection Failed monitors are showing critical under the following views

Monitoring -> SharePoint 2010 Products ->
Configuration Databases
Content Databases
Shared Services
Diagram View

Resolution: Create a new override for the connection string value on the SQL Database Connection Failed monitors

To create the override, do the following:

1. From the Monitoring -> SharePoint 2010 Products -> Active Alerts view, select an affected monitor
2. Under the Alert Details (bottom pane), note the Alert Description. It should look like this:

Example:
Alert Description
A critical incident has occurred where the connection to database Data Source=sp2010srv2;Initial Catalog=SharePoint_Config;Integrated Security=True;Enlist=False;Connect Timeout=15 failed

3. Copy and paste the text into a text editor such as notepad.exe
4. Right-click the monitor again and select View or edit the settings of this monitor
5. In the SQL Database Connection Failed Properties window, select the Overrides tab and click the Override button
6. Select the option "For a specific object of class: XXX"

Example:
For a specific object of class: SharePoint Configuration Database

7. In the Select Object under matching objects select the appropriate matching object and click OK

Example:
Configuration Database

8. Override the ConnectionString parameter value, changing it from the default value to the full connection string

Example:
From:
Provider=SQLOLEDB;$Target/Property[Type="Microsoft.SharePoint.Foundation.2010.SPDatabase"]/ConnectionString$

To:
Provider=SQLOLEDB;Data Source=SP2010srv2;Initial Catalog=SharePoint_Config;Integrated Security=SSPI;Enlist=False;Connect Timeout=15

9. Create a new override management pack or save to an existing override management pack other than the and save the changes by clicking OK

NOTE Because each database needs its own connection string that matches its database name (Initial Catalog), you will need to modify the connection string previously copied from the alert description of the monitor and change Integrated Security from True to SSPI.

Examples:
Data Source=sp2010srv2;Initial Catalog=SharePoint_Config;Integrated Security=SSPI;Enlist=False;Connect Timeout=15
Data Source=sp2010srv2;Initial Catalog=SharePoint_AdminContent_0ada3e0b-a0f6-4af5-a311-34bcedb1c4eb;Integrated Security=SSPI;Enlist=False;Connect Timeout=15
Data Source=sp2010srv2;Initial Catalog=WSS_Content;Integrated Security=SSPI;Enlist=False;Connect Timeout=15
Data Source=sp2010srv2;Initial Catalog=Bdc_Service_DB_17ab85413d424b84ac58ea247e7f5b47;Integrated Security=SSPI;Enlist=False;Connect Timeout=15
Data Source=sp2010srv2;Initial Catalog=Search_Service_Application_CrawlStoreDB_04e2a4bcdb974275954c0ab090d8a0aa;Integrated Security=SSPI;Enlist=False;Connect Timeout=15
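
Since the same change has to be repeated for every database, the transformation can be sketched in a few lines of Python. This is only an illustration of the manual steps above, not a tool shipped with the management pack. The function name `build_override_connection_string` is hypothetical, and the sketch assumes simple `key=value` pairs separated by semicolons, with no quoted values that contain semicolons.

```python
def build_override_connection_string(alert_connection_string, provider="SQLOLEDB"):
    """Turn the connection string copied from the alert description into the
    override value: prefix the provider and use Integrated Security=SSPI."""
    parts = []
    for item in alert_connection_string.strip().rstrip(";").split(";"):
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        # Replace True with SSPI, as in the override examples in this article.
        if key.lower() == "integrated security" and value.lower() == "true":
            value = "SSPI"
        parts.append(f"{key}={value}" if sep else key)
    return f"Provider={provider};" + ";".join(parts)

copied = ("Data Source=sp2010srv2;Initial Catalog=SharePoint_Config;"
          "Integrated Security=True;Enlist=False;Connect Timeout=15")
override_value = build_override_connection_string(copied)
print(override_value)
```

Only the connection string itself is passed in. The surrounding alert text ("A critical incident has occurred where the connection to database ... failed") must be removed first.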

User Education - Sync Time Overrides

We recommend using the default values for sync time. If the default values are not appropriate for your environment, carefully consider the performance impact that changing them may cause.
SyncTime overrides are particularly useful during failed discovery troubleshooting. By overriding the default values you can configure the start time of different workflows and isolate discovery problems.
SyncTime (start time) is a property of discoveries and monitors. SyncTime is a string value in the format of "HH:mm". SyncTime, IntervalSeconds and Management Pack Import time together determine the exact run time of a given workflow.
The BaseStartTime attribute can have a value in the form of "HH:mm" or an integer. The "HH:mm" format sets the start time alignment on which the cycle repeats. The integer format sets the alignment start time to the current time plus that many seconds. Be aware that if you set an integer value, the cycle start time is recalculated every time you rerun the admin task.
The Length attribute specifies the length (in seconds) of each cycle.
The Spacing attribute specifies the spacing time (in seconds) between one workflow's timeout time and the next workflow's start time.
For example, if IntervalSeconds = 21600 (6 hours) and SyncTime = "01:15", the possible run times of the workflow are 1:15 AM, 7:15 AM, 1:15 PM and 7:15 PM. If the Management Pack is imported after 1:15 AM but before 7:15 AM, the workflow starts at 7:15 AM; if it is imported after 1:15 PM but before 7:15 PM, the workflow starts at 7:15 PM. However, due to other factors such as network delay, the actual start time may still vary. Do not change the default SyncTime value unless absolutely required.
So if you import the MP at 3:00 PM, IntervalSeconds is set to 28800 (8 hours) and SyncTime is set to "15:00", the workflow next runs at 11:00 PM, which is 8 hours after the sync time.
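
The SyncTime arithmetic can be checked with a short Python sketch. It is a simplified model for illustration, not the scheduler used by Operations Manager. It assumes the run times form a grid anchored at SyncTime on the day of import, which matches the 6-hour example when IntervalSeconds divides evenly into 24 hours. The function name `next_run` is hypothetical.

```python
from datetime import datetime, timedelta

def next_run(sync_time, interval_seconds, after):
    """Return the first run time strictly after `after`, on a grid that
    starts at sync_time ("HH:mm") and repeats every interval_seconds."""
    hours, minutes = (int(part) for part in sync_time.split(":"))
    anchor = after.replace(hour=hours, minute=minutes, second=0, microsecond=0)
    interval = timedelta(seconds=interval_seconds)
    # Floor division of timedeltas yields the number of whole cycles elapsed
    # since the anchor (negative if `after` is earlier in the day).
    cycles = (after - anchor) // interval + 1
    return anchor + cycles * interval

# Management pack imported at 3:00 AM with SyncTime 01:15 and a 6-hour interval.
imported = datetime(2012, 2, 29, 3, 0)
print(next_run("01:15", 21600, imported))
```

For an import at 3:00 AM the sketch returns 7:15 AM on the same day, and for an import at 2:00 PM it returns 7:15 PM, in line with the example in the text. Actual start times can still vary because of factors such as network delay.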
Possible error messages when these values are not configured properly are shown below.

Example 1

The Event Policy for the process started at 6:46:08 PM has detected errors in the output. The 'StdErr' policy expression: 
.+ 
matched the following output: 
Cycle length 60 is not long enough to ensure the order of workflows 
Please change cycle length to no less than 360 or decrease times, timeout values, and/or spacing 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile -Noninteractive "$ep = get-executionpolicy; if 
($ep -gt 'RemoteSigned') {set-executionpolicy remotesigned} & '"C:\Program Files\System Center Operations Manager 2007\Health Service 
State\Monitoring Host Temporary Files 22\9315\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX 
Error Code: -2130771918 (Unknown error (0x80ff0032)).

Example 2

The Event Policy for the process started at 6:42:01 PM has detected errors in the output. The 'StdErr' policy expression: 
.+ 
matched the following output: 
Cycle length must be in whole minutes (times of 60) 
Length value 500 is undefined or invalid 
Command executed: "C:\Windows\system32\cmd.exe" /c powershell.exe -NoLogo -NoProfile -Noninteractive "$ep = get-executionpolicy; if 
($ep -gt 'RemoteSigned') {set-executionpolicy remotesigned} & '"C:\Program Files\System Center Operations Manager 2007\Health Service 
State\Monitoring Host Temporary Files 21\9314\AdminTask.ps1"' 'SharePointMP.Config'" 
Working Directory: C:\Program Files\System Center Management Packs\ 
One or more workflows were affected by this. 
Workflow name: Microsoft.SharePoint.Foundation.2010.ConfigSharePoint 
Instance name: Microsoft SharePoint 2010 Farm Group 
Instance ID: {B7E9A5AF-62D1-CF79-0AE8-044AE7CECBD7} 
Management group: XXX 
Error Code: -2130771918 (Unknown error (0x80ff0032)).
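
The second error can be caught before running the task by checking that Length is a whole number of minutes. The following Python sketch covers only that one rule. The minimum cycle length reported in Example 1 depends on the workflow times, timeout values and spacing, and the article does not give the formula, so it is not modeled here. The function name `check_cycle_length` is hypothetical.

```python
def check_cycle_length(length_seconds):
    """Return None if Length is a positive multiple of 60 seconds,
    otherwise return the next whole-minute value that could be used."""
    if length_seconds > 0 and length_seconds % 60 == 0:
        return None
    # Round up to the next multiple of 60, with a floor of one minute.
    return max(60, -(-length_seconds // 60) * 60)

for length in (28800, 6240, 500):
    suggestion = check_cycle_length(length)
    if suggestion is None:
        print(f"Length={length}: whole minutes")
    else:
        print(f"Length={length}: not a multiple of 60, consider {suggestion}")
```

A Length of 500 is rejected, as in Example 2, and the sketch suggests 540. A value that passes this check can still fail the minimum-length check shown in Example 1.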

User Education - Isolating Discoveries

The following example sets the run time of the discovery to run 5 minutes after running the configuration task for a single workflow that has been failing.

</Annotation>
<WorkflowCycle BaseStartTime="+5" Length="6240" Spacing="15">
<Workflow Id="SPFarm.Discovery" Type="Discovery" Times="1" />
</WorkflowCycle>

If you start this procedure at 7:35 PM, the override, when viewed from the Authoring -> Management Pack Objects -> Overrides view in the console, should be configured as follows to start at 7:40 PM:
SyncTime Override Value = 19:40
Interval Seconds = 6240

Enable Debug Tracing

Enabling Debug Tracing will enable debug trace on those agent computers that run Windows PowerShell script based discoveries and SPHA monitors. By default it is turned off. When it is enabled, the script based discoveries and monitors will write debug trace information to Event Log in Operations Manager channel on all agent computers, and all the debug trace events have an event ID of 0.
To enable debug tracing do the following:

In the Operations Console, select Monitoring.
Select SharePoint 2010 Products.
Select the Administration view.
On the Actions panel, click the task named “Set DebugTrace for SharePoint Management Pack”. A Run Task window appears.
To enable debug trace (the default option), click Run. To disable debug trace, click Override.
Set the Enabled parameter value to “False” in the dialog that appears.
Click Override to close the dialog.
Click Run.
Wait for the task to finish in Task Status window, and then check the Task Output to ensure that the task completes successfully.
Click Close.

How to use debug tracing

Run the “Set DebugTrace For SharePoint Management Pack” task, rerun the Admin Task, and then go to the Operations Manager event channel on the server and check the events with ID = 0. Note the timestamp of each event, and then check the SharePoint ULS trace log entries for the same time to confirm what the event reports.
For more information about the ULS trace log, see the SharePoint Foundation 2010 documentation on TechNet (http://technet.microsoft.com/en-us/sharepoint/ee263910.aspx ).

For Advanced Users:

For further troubleshooting of failed discoveries, you can use the Operations Manager 2007 R2 Workflow Analyzer, which is part of the Operations Manager 2007 R2 MP Authoring Resource Kit (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=18222).
You can also enable diagnostic tracing in System Center Operations Manager 2007 (http://support.microsoft.com/kb/942864).

User Education - Adding Workflows to SharePoint Config file
To discover both SharePoint Foundation 2010 and SharePoint 2010 Products, add workflows to the SharePointMP.config file as described here.

Default:
<WorkflowCycle BaseStartTime="+300" Length="28800" Spacing="60">
<Workflow Id="WSSInstallation.Discovery" Type="Discovery" Times="1" />
<Workflow Id="SPFarm.Discovery" Type="Discovery" Times="1" />
<Workflow Id="SPService.Discovery" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery" Type="Discovery" Times="4" />
<Workflow Id="SPHARule.Discovery" Type="Discovery" Times="1" />
<Workflow Id="SPHARuleMonitor.Availability;SPHARuleMonitor.Security;SPHARuleMonitor.Performance;SPHARuleMonitor.Configuration;SPHARuleMonitor.Custom" Type="Monitor" Times="8" />
<Workflow Id="SPHARuleMonitor.SPServer.Availability;SPHARuleMonitor.SPServer.Security;SPHARuleMonitor.SPServer.Performance;SPHARuleMonitor.SPServer.Configuration;SPHARuleMonitor.SPServer.Custom" Type="Monitor" Times="8" />
</WorkflowCycle>

Add the following section to the SharePointMP.config file:
<Workflow Id="MOSSInstallation.Discovery;WACInstallation.Discovery;SearchExpressInstallation.Discovery;SearchStandardInstallation.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="1" />
<Workflow Id="SPService.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery.WAC" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />

The configuration file should now look like this:
<WorkflowCycle BaseStartTime="+300" Length="28800" Spacing="60">
<Workflow Id="WSSInstallation.Discovery" Type="Discovery" Times="1" />
<Workflow Id="MOSSInstallation.Discovery;WACInstallation.Discovery;SearchExpressInstallation.Discovery;SearchStandardInstallation.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="1" />
<Workflow Id="SPFarm.Discovery" Type="Discovery" Times="1" />
<Workflow Id="SPService.Discovery" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery" Type="Discovery" Times="4" />
<Workflow Id="SPService.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />
<Workflow Id="SPSharedService.Discovery.WAC" MP="Microsoft.SharePoint.Server.2010" Type="Discovery" Times="4" />
<Workflow Id="SPHARule.Discovery" Type="Discovery" Times="1" />
<Workflow Id="SPHARuleMonitor.Availability;SPHARuleMonitor.Security;SPHARuleMonitor.Performance;SPHARuleMonitor.Configuration;SPHARuleMonitor.Custom" Type="Monitor" Times="8" />
<Workflow Id="SPHARuleMonitor.SPServer.Availability;SPHARuleMonitor.SPServer.Security;SPHARuleMonitor.SPServer.Performance;SPHARuleMonitor.SPServer.Configuration;SPHARuleMonitor.SPServer.Custom" Type="Monitor" Times="8" />
</WorkflowCycle>

=====

For the most current version of this article please see the following:

2690744 : Configuring the SharePoint 2010 Management Pack for System Center Operations Manager

J.C. Hornbeck | System Center & Security Knowledge Engineer

↧

KB: How to troubleshoot Event ID 2115 in Operations Manager

April 17, 2012, 9:35 am

≫ Next: How to get Knowledge Editing to work in Operations Manager 2012 with Office 2010

≪ Previous: KB: Configuring the SharePoint 2010 Management Pack for System Center Operations Manager

Here’s another new Knowledge Base article we published today. This one goes through some common troubleshooting tips for eliminating Event IDs 2115 in Operations Manager:

=====

Symptoms

In Operations Manager, one common performance concern is Operations Manager Database and Data Warehouse insertion times. The following description helps identify and troubleshoot problems with Database and Data Warehouse data insertion.

Examine the Operations Manager Event log for the presence of Event ID 2115 events. These events typically indicate that performance issues exist on the Management Server or the Microsoft SQL Server that is hosting the OperationsManager or OperationsManager Data Warehouse databases. Database and Data Warehouse write action workflows run on the Management Servers and these workflows first retain the data received from the Agents and Gateway Servers in an internal buffer. They then gather this data from the internal buffer and insert it into the Database and Data Warehouse. When the first data insertion has completed, the workflows will then create another batch.

The size of each batch of data depends on how much data is available in the buffer when the batch is created; however, a batch can contain at most 5000 data items. If the incoming data item rate increases, or the data item insertion throughput to the Operations Manager and Data Warehouse databases is reduced, the buffer accumulates more data and the batch size grows larger. There are several write action workflows that run on a Management Server. These workflows handle data insertion to the Operations Manager and Data Warehouse databases for different data types. For example:

Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange
Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData
Microsoft.SystemCenter.DataWarehouse.CollectEventData
Microsoft.SystemCenter.CollectAlerts
Microsoft.SystemCenter.CollectEntityState
Microsoft.SystemCenter.CollectPublishedEntityState
Microsoft.SystemCenter.CollectDiscoveryData
Microsoft.SystemCenter.CollectSignatureData
Microsoft.SystemCenter.CollectEventData
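
The buffering and batching behavior described before the workflow list can be pictured with a small Python sketch. It is a conceptual model only, not the actual write action implementation. The 5000-item limit comes from this article, and the names `MAX_BATCH_SIZE` and `take_batch` are illustrative.

```python
from collections import deque

MAX_BATCH_SIZE = 5000  # maximum number of data items per batch, per this article

def take_batch(buffer):
    """Remove and return up to MAX_BATCH_SIZE items from the front of the buffer."""
    size = min(len(buffer), MAX_BATCH_SIZE)
    return [buffer.popleft() for _ in range(size)]

# Simulate a backlog of 12000 data items waiting in the internal buffer.
buffer = deque(range(12000))
batch_sizes = []
while buffer:
    batch_sizes.append(len(take_batch(buffer)))
print(batch_sizes)  # [5000, 5000, 2000]
```

When insertion is slow, items keep arriving while a batch is being written, so each new batch tends to be larger, up to the limit.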

When a Database or Data Warehouse write action workflow on a Management Server experiences slow data batch insertion, for example times in excess of 60 seconds, it begins logging Event ID 2115 to the Operations Manager Event log. This event is logged every minute until the data batch is inserted into the Database or Data Warehouse, or the data is dropped by the write action workflow module. Event ID 2115 is therefore a direct result of latency in inserting data into the Database or Data Warehouse. Below is an example of the event (Event ID 4506) that is logged when data is dropped by the write action workflow module:

Event Type: Error
Event Source: HealthService
Event Category: None
Event ID: 4506
Computer: <RMS NAME>
Description:
Data was dropped due to too much outstanding data in rule "Microsoft.SystemCenter.OperationalDataReporting.SubmitOperationalDataFailed.Alert" running for instance <RMS NAME> with id:"{F56EB161-4ABE-5BC7-610F-4365524F294E}" in management group <MANAGEMENT GROUP NAME>.

Event ID 2115 contains two significant pieces of information: first, the name of the workflow that is experiencing the problem, and second, the elapsed time since the workflow began inserting the last batch of data.

For example:

Log Name: Operations Manager
Source: HealthService
Event ID: 2115
Level: Warning
Computer: <RMS NAME>
Description:
A Bind Data Source in Management Group <MANAGEMENT GROUP NAME> has posted items to the workflow, but has not received a response in 300 seconds. This indicates a performance or functional problem with the workflow.
Workflow Id : Microsoft.SystemCenter.CollectPublishedEntityState
Instance : <RMS NAME>
Instance Id : {88676CDF-E284-7838-AC70-E898DA1720CB}

This particular Event ID 2115 message indicates that the workflow Microsoft.SystemCenter.CollectPublishedEntityState, which writes Entity State data to the Operations Manager database, started inserting a batch of Entity State data 300 seconds ago and the insertion has not yet finished. Normally, inserting a batch of data should complete within 60 seconds. If the Workflow Id contains DataWarehouse, then the problem concerns the Operations Manager Data Warehouse. Otherwise, the problem concerns inserting data into the Operations Manager database.
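
The two pieces of information can be pulled out of the event description with a short Python sketch. The regular expressions are written against the example event text above and may need adjusting for other variants of the message. The function name `parse_event_2115` and the returned field names are hypothetical.

```python
import re

def parse_event_2115(description):
    """Extract the workflow, pending time and affected database from an
    Event ID 2115 description. Returns None if the text does not match."""
    seconds = re.search(r"not received a response in (\d+) seconds", description)
    workflow = re.search(r"Workflow Id\s*:\s*(\S+)", description)
    if not seconds or not workflow:
        return None
    workflow_id = workflow.group(1)
    # Per the article: a Workflow Id containing DataWarehouse points to the
    # Data Warehouse, anything else to the Operations Manager database.
    database = ("Operations Manager Data Warehouse"
                if "DataWarehouse" in workflow_id else "Operations Manager")
    return {
        "workflow_id": workflow_id,
        "pending_seconds": int(seconds.group(1)),
        "database": database,
    }

sample = (
    "A Bind Data Source in Management Group MG1 has posted items to the workflow, "
    "but has not received a response in 300 seconds. This indicates a performance "
    "or functional problem with the workflow.\n"
    "Workflow Id : Microsoft.SystemCenter.CollectPublishedEntityState\n"
)
result = parse_event_2115(sample)
print(result)
```

For the sample, the sketch reports a pending time of 300 seconds against the Operations Manager database. The management group name MG1 is a placeholder.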

Cause

As the description of Event ID 2115 states, this may indicate a database performance problem or too much data incoming from the agents. Event ID 2115 simply indicates that there is a backlog inserting data into the Operations Manager or Operations Manager Data Warehouse database. These events can originate from a number of possible causes, for example a large amount of Discovery data, a database connectivity issue, a full database condition, or disk or network constraints.

In Operations Manager, Discovery data insertion is a relatively expensive process. We define a burst of data as a short period of time where a significant amount of data is received by the Management Server. These bursts of data can cause Event ID 2115 since the data insertion should occur infrequently. If Event ID 2115 consistently appears for Discovery data collection, this can indicate either a Database or Data Warehouse insertion problem or Discovery rules in a Management Pack collecting too much discovery data.

Operations Manager configuration updates caused by Instance Space changes or Management Pack imports have a direct effect on CPU utilization on the Database Server and this can impact Database insertion times. Following a Management Pack import or a large instance space change, it is expected to see Event ID 2115 messages. For more information on this topic please see the following:

2603913 - How to detect and troubleshoot frequent configuration changes in Operations Manager (http://support.microsoft.com/kb/2603913)

If the Operations Manager or Operations Manager Data Warehouse databases are out of space or offline, it is expected that the Management Server will continue to log Event ID 2115 messages to the Operations Manager Event log and the pending time will grow higher.

If the write action workflows cannot connect to the Operations Manager or Operations Manager Data Warehouse databases, or they are using invalid credentials to establish their connection, the data insertion will be blocked and Event ID 2115 messages will be logged accordingly until this situation is resolved.

In Operations Manager, expensive User Interface queries can impact resource utilization on the Database which can lead to latency in Database insertion times. When a user is performing an expensive User Interface operation it is possible to see Event ID 2115 messages logged.

Event ID 2115 messages can also indicate a performance problem if the Operations Manager Database and Data Warehouse databases are not properly configured. Performance problems on the database servers can lead to Event ID 2115 messages. Some possible causes include the following:

The SQL Log or TempDB database is too small or out of space.
The network link from the Operations Manager and Data Warehouse databases to the Management Server is bandwidth constrained or has high latency. In this scenario we recommend that the Management Server be on the same LAN as the Operations Manager and Data Warehouse server.
The data disk hosting the Database, logs or TempDB used by the Operations Manager and Data Warehouse databases is slow or experiencing a functional problem. In this scenario we recommend leveraging RAID 10 and we also recommend enabling battery backed Write Cache on the Array Controller.
The Operations Manager Database or Data Warehouse server does not have sufficient memory or CPU resources.
The SQL Server instance hosting the Operations Manager Database or Data Warehouse is offline.

It is recommended that the Management Server reside on the same LAN as the Operations Manager and Data Warehouse database server.
Event ID 2115 messages can also occur if the disk subsystem hosting the Database, logs or TempDB used by the Operations Manager and Data Warehouse databases is slow or experiencing a functional problem. In this scenario we recommend leveraging RAID 10 and we also recommend enabling battery backed Write Cache on the Array Controller.

Resolution

The first step in troubleshooting Event ID 2115 is to identify which Data Items are being returned within the Event. The Workflow ID indicates the type of Data Items (Discovery, Alerts, Event, Perf) as well as which database is involved. If the Workflow ID contains the term DataWarehouse, then the focus should be on the Operations Manager Data Warehouse. In other instances the focus should be on the Operations Manager Database.

The seconds indicator in the Event ID 2115 message returns how long the workflow in question has been waiting to insert the data items into the Database.
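
As a minimal sketch of this first step, the Python snippet below pulls the Workflow ID and the pending seconds out of an Event ID 2115 description and applies the DataWarehouse rule described above. The function names and the regular expressions are illustrative assumptions based on the sample events in this article, not part of any Operations Manager tooling.

```python
import re

# Patterns modeled on the sample Event ID 2115 descriptions in this article (assumption).
_SECONDS = re.compile(r"not received a response in (\d+) seconds")
_WORKFLOW = re.compile(r"Workflow Id\s*:\s*(\S+)")


def target_database(workflow_id: str) -> str:
    """Apply the rule from the text: 'DataWarehouse' in the Workflow ID
    points to the Data Warehouse, anything else to the operational database."""
    if "DataWarehouse" in workflow_id:
        return "Operations Manager Data Warehouse"
    return "Operations Manager Database"


def summarize_2115(description: str) -> dict:
    """Extract the Workflow ID and pending seconds from a 2115 description."""
    seconds = _SECONDS.search(description)
    workflow = _WORKFLOW.search(description)
    if seconds is None or workflow is None:
        raise ValueError("text does not look like an Event ID 2115 description")
    workflow_id = workflow.group(1)
    return {
        "workflow_id": workflow_id,
        "pending_seconds": int(seconds.group(1)),
        "database": target_database(workflow_id),
    }
```

Feeding it the description text of a logged event returns the workflow, how long it has been waiting, and which database to look at first.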

Scenario 1

In the example Event ID 2115 below, the problem concerns the workflow Microsoft.SystemCenter.CollectSignatureData.

Event Type: Warning
Event Source: HealthService
Event Category: None
Event ID: 2115
Computer: <RMS NAME>
Description:
A Bind Data Source in Management Group <MANAGEMENT GROUP NAME> has posted items to the workflow, but has not received a response in 300 seconds. This indicates a performance or functional problem with the workflow.
Workflow Id : Microsoft.SystemCenter.CollectSignatureData
Instance : <RMS NAME>
Instance Id : {F56EB161-4ABE-5BC7-610F-4365524F294E}

Resolution 1

We can identify the Performance Signature Data Collection Rules in this example by executing the following SQL Query. This query should be executed in SQL Management Studio against the Operations Manager database.

-- Return all Performance Signature Collection Rules
Use OperationsManager
select
managementpack.mpname,
rules.rulename
from performancesignature with (nolock)
inner join rules with (nolock)
on rules.ruleid = performancesignature.learningruleid
inner join managementpack with(nolock)
on rules.managementpackid = managementpack.managementpackid
group by managementpack.mpname, rules.rulename
order by managementpack.mpname, rules.rulename

This query will return all Performance Signature Collection Rules and their respective Management Pack name. A column is returned for Management Pack name and Rule name.

The following Performance Monitor counters on a Management Server provide information about Database and Data Warehouse write action insertion batch size and time. If the batch size is growing (the default batch size is 5000 items), this indicates that the Management Server is either slow inserting data into the Database or Data Warehouse, or is receiving a burst of Data Items from the Agents or Gateway Servers.

· OpsMgr DB Write Action Modules(*)\Avg. Batch Size
· OpsMgr DB Write Action Modules(*)\Avg. Processing Time
· OpsMgr DW Writer Module(*)\Avg. Batch Processing Time, ms
· OpsMgr DW Writer Module(*)\Avg. Batch Size

From the Database and Data Warehouse write action account Average Processing Time counter, we can understand how long it takes on average to write a batch of data to the Database and Data Warehouse. Depending upon the amount of time it takes to write a batch of data to the Database, this may present an opportunity for tuning.
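
One way to judge whether the batch size is growing is to compare the earlier and later halves of a series of Avg. Batch Size samples. The Python sketch below does that. The function name and the 20 percent growth margin are hypothetical choices for illustration; only the default batch size of 5000 items comes from the text.

```python
from statistics import mean

DEFAULT_BATCH_SIZE = 5000  # default batch size cited in the text


def batch_size_is_growing(samples, min_increase=0.2):
    """Return True when the later half of the samples averages at least
    min_increase (a hypothetical 20% margin) above the earlier half."""
    if len(samples) < 4:
        raise ValueError("need at least four samples to compare halves")
    half = len(samples) // 2
    earlier = mean(samples[:half])
    later = mean(samples[half:])
    if earlier == 0:
        # Any non-zero later average counts as growth from an idle baseline.
        return later > 0
    return later >= earlier * (1 + min_increase)
```

A growing series only says that insertion is falling behind or that a burst is arriving; the processing time counters are still needed to tell the two apart.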

Scenario 2

The SQL Server instance hosting the Operations Manager Database or Data Warehouse is offline. Event ID 2115 as well as Event ID 29200 appear within the Operations Manager Event log. For example:

Log Name: Operations Manager
Source: HealthService
Date:
Event ID: 2115
Level: Warning
Description:

A Bind Data Source in Management Group MSFT has posted items to the workflow, but has not received a response in 60 seconds. This indicates a performance or functional problem with the workflow.

Workflow Id : Microsoft.SystemCenter.CollectEventData
Instance : name.contoso.local
Instance Id : {88676CDF-E284-7838-AC70-E898DA1720CB}

Log Name: Operations Manager
Source: OpsMgr Config Service
Event ID: 29200
Level: Error
Description:

OpsMgr Config Service has lost connectivity to the OpsMgr database, therefore it can not get any updates from the database. This may be a temporary issue that may be recovered from automatically. If the problem persists, it usually indicates a problem with the database. Reason:

A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)

Resolution 2

To resolve the issue in this scenario, follow these steps:

Connect to the Server hosting the Operations Manager database.
Open the Services applet.
Verify that the SQL Server (MSSQLSERVER) service is started and running.
If the SQL Server (MSSQLSERVER) service is not started and running, start the service.

Once Database connectivity is restored workflows should resume successfully storing data within the respective database. Event ID 31554 validates that the information has been written successfully.

For example:

Log Name: Operations Manager
Source: Health Service Modules
Event ID: 31554
Task Category: Data Warehouse
Level: Information

Description:
Workflow succeeded storing data in the Data Warehouse

One or more workflows were affected by this.

Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectEventData
Instance name: name.contoso.local
Instance ID: {88676CDF-E284-7838-AC70-E898DA1720CB}

Scenario 3

Event ID 2115 caused by invalid RunAs Credentials.

Resolution 3

Examine the Operations Manager Event log for the following events. These events typically indicate that the Data Warehouse SQL Server Authentication Account may have incorrect credentials.

Log Name: Operations Manager
Source: HealthService
Event ID: 7000
Task Category: Health Service
Level: Error
Description:

The Health Service could not log on the RunAs account <ACCOUNT NAME> for management group <MANAGEMENT GROUP NAME>. The error is Logon failure: unknown user name or bad password.(1326L). This will prevent the health service from monitoring or performing actions using this RunAs account.

Log Name: Operations Manager
Source: HealthService
Event ID: 7015
Task Category: Health Service
Level: Error
Description:

The Health Service cannot verify the future validity of the RunAs account <ACCOUNT NAME> for management group <MANAGEMENT GROUP NAME>. The error is Logon failure: unknown user name or bad password.(1326L).

To resolve the issue in this scenario, follow these steps:
1. Open the Operations Manager console.
2. Select Administration.
3. Select Run As Configuration\Accounts.
4. Configure the appropriate credentials for the Data Warehouse SQL Server Authentication Account.
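
The event pairings described in Scenarios 2 and 3 can be sketched as a small triage helper. This Python example is only an illustration: the function name and the returned strings are assumptions, and the mapping covers just the event IDs this article names (29200 for lost database connectivity, 7000 and 7015 for RunAs logon failures).

```python
def triage_2115(event_ids):
    """Map event IDs seen alongside Event ID 2115 to a scenario from this article."""
    seen = set(event_ids)
    if 2115 not in seen:
        return "no Event ID 2115 present"
    if 29200 in seen:
        # Config Service lost connectivity to the database (Scenario 2).
        return "Scenario 2: database offline or unreachable"
    if seen & {7000, 7015}:
        # RunAs account logon failures (Scenario 3).
        return "Scenario 3: invalid RunAs credentials"
    return "undetermined: check the Workflow ID and database performance"
```

The order of the checks is a design choice: an unreachable database is tested first because it blocks every workflow, while a credential problem affects only the workflows that use that account.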

Scenario 4

Event ID 2115 caused by disk performance issues. The data disk hosting the Database, logs or TempDB used by the Operations Manager and Data Warehouse databases is slow or experiencing a functional problem. In this scenario we recommend leveraging RAID 10 and we also recommend enabling battery backed Write Cache on the Array Controller.

Resolution 4

Counters to identify disk pressure

Capture these Physical Disk counters for all drives that contain SQL data or log files:

% Idle Time: How much disk idle time is being reported. Anything below 50 percent could indicate a disk bottleneck.
Avg. Disk Queue Length: This value should not exceed 2 times the number of spindles on a LUN. For example, if a LUN has 25 spindles, a value of 50 is acceptable. However, if a LUN has 10 spindles, a value of 25 is too high. You could use the following formulas based on the RAID level and number of disks in the RAID configuration:

- RAID 0: All of the disks are doing work in a RAID 0 set
  Average Disk Queue Length <= # (Disks in the array) *2
- RAID 1: Half the disks are “doing work”; therefore, only half of them can be counted toward Disk Queue
  Average Disk Queue Length <= # (Disks in the array/2) *2
- RAID 10: Half the disks are “doing work”; therefore, only half of them can be counted toward Disk Queue
  Average Disk Queue Length <= # (Disks in the array/2) *2
- RAID 5: All of the disks are doing work in a RAID 5 set
  Average Disk Queue Length <= # (Disks in the array) *2

Avg. Disk sec/Transfer: The number of seconds it takes to complete one disk I/O
Avg. Disk sec/Read: The average time, in seconds, of a read of data from the disk
Avg. Disk sec/Write: The average time, in seconds, of a write of data to the disk
NOTE The three Avg. Disk sec counters above should consistently have values of approximately .020 (20 ms) or lower and should never exceed .050 (50 ms). The following are the thresholds that are documented in the SQL Server performance troubleshooting guide:
- Less than 10 ms: very good
- Between 10 - 20 ms: okay
- Between 20 - 50 ms: slow, needs attention
- Greater than 50 ms: serious I/O bottleneck
Disk Bytes/sec: The number of bytes being transferred to or from the disk per second
Disk Transfers/sec: The number of input and output operations per second (IOPS)
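
The queue length rule and the latency thresholds above can be turned into two small checks. The Python sketch below is illustrative only; the function names are assumptions. RAID 5 is counted like RAID 0 because the text says all disks in the set do work, and that reading is also an assumption.

```python
def max_avg_disk_queue_length(raid_level: str, disks: int) -> int:
    """Upper bound for Avg. Disk Queue Length: 2 x the number of working disks."""
    if raid_level in ("0", "5"):
        working = disks          # all disks do work (RAID 5 handling is an assumption)
    elif raid_level in ("1", "10"):
        working = disks // 2     # half the disks are mirrors
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    return working * 2


def classify_latency(seconds: float) -> str:
    """Classify an Avg. Disk sec/Read, sec/Write or sec/Transfer sample."""
    ms = seconds * 1000
    if ms < 10:
        return "very good"
    if ms <= 20:
        return "okay"
    if ms <= 50:
        return "slow, needs attention"
    return "serious I/O bottleneck"
```

For example, an 8-disk RAID 10 array should stay at or below an average queue length of 8, and a write latency sample of 0.035 seconds falls into the "slow, needs attention" band.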

When % Idle Time is low (10 percent or less), this means that the disk is fully utilized. In this case, the last two counters in this list (“Disk Bytes/sec” and “Disk Transfers/sec”) provide a good indication of the maximum throughput of the drive in bytes and in IOPS, respectively. The throughput of a SAN drive is highly variable, depending on the number of spindles, the speed of the drives, and the speed of the channel. The best bet is to check with the SAN vendor to find out how many bytes and IOPS the drive should support. If % Idle Time is low, and the values for these two counters do not meet the expected throughput of the drive, engage the SAN vendor to troubleshoot.
The following links provide deeper insight into troubleshooting SQL Server performance:

Scenario 5

Event ID 2115 is logged, and a management server generates an "unable to write data to the Data Warehouse" alert in System Center Operations Manager.
You experience the following symptoms on a management server computer that is running Microsoft System Center Operations Manager:

The management server generates one or more alerts that resemble the following:
Performance data collection process unable to write data to the Data Warehouse

Performance data collection process unable to write data to the Data Warehouse. Failed to store data in the Data Warehouse. Exception 'SqlException': Management Group with id '9069F7BD-55B8-C8E8-1CF9-4395F45527E2' is not allowed to access Data Warehouse under login 'DOMAIN\Action_Account' One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData Instance name: dataWarehouseServer.domain.com Instance ID: {48936EE3-4E3E-BEE1-8C09-AFDAB8ECF236} Management group: Management Group Name.
The following event is logged in the Operations Manager event log on the management server:
Event Type: Warning
Event Source: HealthService
Event Category: None
Event ID: 2115
Date: date
Time: time
User: N/A
Computer: ManagementServerName
Description: A Bind Data Source in Management Group Management Group Name has posted items to the workflow, but has not received a response in 1712 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData Instance : ManagementServerName.domain.com Instance Id : {C7FDDE2A-E0AA-4B80-70DE-1D50D9965221}

Resolution 5

This issue may occur if the management server does not have accounts that are specified for its data warehouse "Run As" profiles. This issue is more likely to affect a secondary management server.

To resolve this problem, follow these steps:

On the computer that is running System Center Operations Manager, open the Operations Console.
In the navigation pane, click Administration.
Expand Security, and then click Run As Profiles.
In the Run As Profiles view, double-click Data Warehouse Account.
In the Run As Profile Properties - Data Warehouse Account properties dialog box, click the Run As Accounts tab, and then click New.
In the Run As Account list, click Data Warehouse Action Account.
In the Name list, click the management server that generated the alert.
Click OK two times.
Follow steps 4 through 8 to assign the appropriate Run As account to the following profiles:
Data Warehouse Configuration Synchronization Reader Account
Data Warehouse Report Deployment Account
Data Warehouse SQL Server Authentication Account
For each profile, select the Run As account that matches the name of the Run As profile. For example, make the following assignments:
Assign the Data Warehouse Configuration Synchronization Reader Account to the Data Warehouse Configuration Synchronization Reader Account profile.
Assign the Data Warehouse Report Deployment Account to the Data Warehouse Report Deployment Account profile.
Assign the Data Warehouse SQL Server Authentication Account to the Data Warehouse SQL Server Authentication Account profile.
On the management server that generated the alert, restart the OpsMgr Health Service.
In the Operations Manager event log on the management server, verify that event ID 31554 events are logged.
NOTE Event ID 31554 indicates that the monitor state has changed to Healthy. This change resolves the alert.

Scenario 6

Event ID 2115 occurs on a server running HP MPIO FF DSM XP v3.01 to which there are no LUNs presented. When the user opens Performance Monitor and attempts to add a counter, Performance Monitor will hang and the handle count for this application increases rapidly.

Resolution 6

There are two workarounds for this issue.

Rename the HPPerfProv.dll file and reboot Windows. Performance Monitor will work without issue when the file is renamed and not loaded.
Have at least 1 LUN present on the system.

For more information on this issue please see the following:

Event ID 2115 is caused by HPPerfProv.DLL
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c01743552&prodTypeId=18964&prodSeriesId=421492

=====

For the most current version of this article please see the following:

2681388 : How to troubleshoot Event ID 2115 in Operations Manager

J.C. Hornbeck | System Center & Security Knowledge Engineer


↧

How to get Knowledge Editing to work in Operations Manager 2012 with Office 2010

October 10, 2012, 1:46 pm

≫ Next: Performance IOPS for the DB and DW in OpsMgr 2007

≪ Previous: KB: How to troubleshoot Event ID 2115 in Operations Manager

If you have tried editing knowledge in OpsMgr 2012 with Office 2010, you have likely run into the error “Could not load file or assembly ‘Microsoft.Office.Interop.Word’”. Working with our support team (Mitch L, Stephen R), we have a temporary workaround that works on both the x64 and x86 versions of Office 2010. I need to caveat that this workaround is not officially supported by the product team at this time, but we plan to support it in our service pack release.

The Workaround:
Step 1: Install Visual Studio 2005 Tools for Office Second Edition Runtime
Link to download VSTO 2005 SE: http://www.microsoft.com/en-us/download/details.aspx?id=24263

Step 2: Install Visual Studio 2010 Tools for Office Runtime
Link to download VSTO 2010 (64 or 32): http://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=20479
Note: Depending on the version of Office 2010 you have installed, you will need to install the x64 or x86 version of VSTO 2010.

Step 3: Reboot the OpsMgr Console machine. Note: The VSTO setup does not prompt you to reboot, but you must do so or the workaround will not work.

Step 4: Replace the Knowledge.DOT file found in install path:\\Program Files\System Center 2012\Operations Manager\Console with the Knowledge.DOT file attached to this blog

Step 5: Replace the Microsoft.EnterpriseManagement.Monitoring.Console.exe configuration file found in install path:\\Program Files\System Center 2012\Operations Manager\Console with the Microsoft.EnterpriseManagement.Monitoring.Console.exe file attached to this blog

This posting is provided "AS IS" with no warranties, and confers no rights. The opinions expressed here represent my own and not those of my employer. I accept no liability for the content of this blog post, or for the consequences of any actions taken on the basis of the information provided. Using this workaround is at your own risk.

↧

Performance IOPS for the DB and DW in OpsMgr 2007

June 24, 2008, 12:27 pm

≫ Next: You may need to update the agent restart threshold on Exchange 2007 Servers Managed by Operations Manager 2007

≪ Previous: How to get Knowledge Editing to work in Operations Manager 2012 with Office 2010

The performance team had a thread discussing IOPS requirements with a customer who had some really good questions, so I thought I would share the thread with you all. For sizing the Operations Manager Database and Data Warehouse, I would highly recommend trying out the SCCP planning tool, which will give you a lot of guidance for sizing your DBs. Below are some questions we answered for the customer on IOPS requirements, which I think may be useful to you. Also, one of our MVPs wrote a great article on IOPS requirements for OpsMgr based on research he had done, which can be found here: http://wchomak.spaces.live.com/blog/cns!F56EFE25599555EC!610.entry

1) Does Ops Mgr 2007 have large sprocs and / or queries that are sensitive to timely completion? Do these sprocs and / or queries run on very large data sets (if so, what's the potential range of sizes), and do they branch a lot or do other things that might generate a lot of random disk I/O?

The DB is an integral part of OpsMgr and part of the end to end operations – in that sense, there is no performance impact against OpsMgr due to sprocs and queries. For larger environments like this, one of the largest database queries will happen during a Configuration change (Importing an MP, making an override on common MP, etc). The amount of IO needed to perform this query will depend on the number of instances you are monitoring. That being said, this is not a very frequent operation, unlike say pulling the Active Alerts View in the Operations Console.

2) Do these sprocs and / or queries generally tailor better to small-block random disk I/Os (4 - 8KB), medium block random disk I/Os (16 - 32KB), or large random disk I/Os (64+KB)? Do any of the sprocs tailor better to sequential disk I/O?

SQL Server’s most basic IO happens in 8K pages. You should see 8K random and sequential IOs, depending on the level of fragmentation of your tables and indexes. OpsMgr has daily tasks to defrag and reindex tables and indexes as needed. For more information, take a look at this document from the SQL Server 2005 Books Online: http://msdn.microsoft.com/en-us/library/ms190969.aspx One advantage of OpsMgr is that we have separated the Operational and Reporting databases. This allows the “long / expensive” queries to run outside of the Operational data, so that running reports doesn’t impact data coming in (unless the databases are on the same SQL server and physical disks).

3) Can you provide additional details about why Ops Mgr 2007 scalability is limited to 6,000 agents? We need to understand what limits Ops Mgr to 6,000 agents, whether it is OS, database, networking, threading, memory, or other limitation?

We recommend no more than 50 UI Consoles and 6000 Agents in a single MG, because the RMS and DB may become system bottlenecks beyond that point.

4) What is the system bottleneck in the RMS/DB that you refer to? Is it based upon the DB software, lock contention, a disk I/O shortage, something else?

The bottlenecks tend to be Memory and CPU on the RMS for the three OpsMgr services running there (mainly ConfigService), and Database IO on both the OpsDB and OpsDW. Depending on the number of Agents, Consoles, and MPs installed, these bottlenecks may be a bit more or less severe, but they are the main bottlenecks for overall system scalability.

5) What I/O size (4K, 8K, 16K, etc?) must be sustained at 125 IO ops?

Since the number of IOs has a much bigger impact than their size, the size is less important. This number (125 random IOs/sec) is a rule of thumb we use; faster disks with smaller write sizes may exceed it, but only marginally.

6) How was this requirement for 125 IO ops derived (please be as specific as possible)?

This is based on a typical 10K-15K RPM SCSI disk, with completely random reading/writing. (Sequential IOs and even random/sequential IO mixes will be faster.) If it is not clear already, this number gets doubled (250 IOps) when you have a 4-disk RAID 10 array since you get two disks (and their respective mirrors) working in parallel.

7) Does this rule hold for every type of RAID (1, 5, 1+0, etc)?

For RAID 0, it gets multiplied by the number of disks you have in your array. For RAID 10, you’d multiply it by half the number of disks you have in your array (since half of your disks are used for mirroring). For RAID 1, you’d get no improvement in performance (RAID 1 only provides added redundancy – no performance benefit), so the same rule would hold true here. RAID 5 (which we don’t highly recommend, by the way, since it mostly benefits disk reads, but less so for writes), also gets some benefit, but it will be mostly for disk reads.

8) When considering a 14 drive 1+0 array, does it assume 14x125 IO ops, or 7x125 IO ops, or does it assume 14x125 for reads and 7x125 for writes?

Since RAID 1+0 (RAID 10) only uses half of the disks for performance gain (the other half are for redundancy), you would see a 7 X 125 IOps factor for both random reads and random writes. Trayce J informed me that for READS – some controllers & firmware allow the reads to take place across all the disks in the RAID 1+0 array and thus give you potentially 14x125. Other controllers only give you the 7x125.
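
The arithmetic in answers 6 through 8 can be sketched as follows. This Python example is an illustration, not a sizing tool: the function name and the reads_use_all_disks flag are assumptions, the 125 IOPS per disk figure is the rule of thumb from this thread, and RAID 5 is left out because the answers give no multiplier for it.

```python
IOPS_PER_DISK = 125  # rule-of-thumb random IOPS for one 10K-15K RPM disk (from the thread)


def estimated_random_iops(raid_level: str, disks: int, reads_use_all_disks: bool = False) -> int:
    """Estimate random IOPS for an array using the multipliers given above."""
    if raid_level == "0":
        return disks * IOPS_PER_DISK            # every disk contributes
    if raid_level == "10":
        # Half the disks are mirrors; some controllers read from all of them.
        effective = disks if reads_use_all_disks else disks // 2
        return effective * IOPS_PER_DISK
    if raid_level == "1":
        return IOPS_PER_DISK                    # redundancy only, no performance gain
    raise ValueError(f"no multiplier given for RAID level {raid_level}")
```

With these assumptions a 4-disk RAID 10 array comes out at 250 IOPS and a 14-drive RAID 10 array at 875 IOPS, or up to 1750 IOPS for reads on controllers that read from both sides of each mirror.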

9) Please provide detail on how the estimate of typical support for 2,000 agents/server was derived. If the volume of operational data were to remain the same, would the number of management packs influence the number of agents/server? What's the relationship between operational data volume and use of mgmt server resources?

An increased number of management packs puts a greater load on the RMS, but not the MS’s. So, this would not change the number of agents/MS, but may affect the number of agents that can be monitored in the deployment, due to potential bottlenecks on the RMS.

Higher operational data volume requires more management server resources.

Number of Management Packs, and more importantly the number of discovered instances impacts the Memory and CPU usage on the RMS.

Satya Vel

↧

You may need to update the agent restart threshold on Exchange 2007 Servers Managed by Operations Manager 2007

July 31, 2008, 7:09 pm

≫ Next: How to Delete an Operations Manager 2007 Managed Object from a Management Group using PowerShell (Updated PS script which works on PowerShell V2 and above)

≪ Previous: Performance IOPS for the DB and DW in OpsMgr 2007

The Operations Manager 2007 Management Pack tracks memory usage of the MonitoringHost.exe process via the private bytes counter. By default, the agent will be automatically restarted if memory usage for MonitoringHost.exe is above 100MB. On large Exchange 2007 systems, this threshold may be too low. The recommended threshold is 600MB.

This is why:

On Exchange 2007 servers the .NET runtime (CLR) gets loaded into MonitoringHost.exe. The way Windows performance counters work is that, to read a performance counter from an application, a DLL provided by that application is loaded into the reading process. Since some of the Exchange services (for example Hub Transport) are written in managed code, the CLR performance counter client is loaded into the MonitoringHost.exe process for reading those counters. The .NET runtime performance counter client implementation loads the CLR to handle reading the counters from the managed service.

The CLR handles memory management via a garbage collection system rather than traditional native code where you explicitly allocate and de-allocate memory. In a garbage collection system the application doesn't explicitly release memory. Periodically the garbage collector will run to see what objects are no longer being used and then free up the memory for those objects. The CLR garbage collector is designed to monitor system load so that if there is no memory pressure then it doesn't need to run as often. When the garbage collector runs it will internally release objects and make space for new objects. However, it may not trim the amount of memory it has requested from the operating system if it detects there is no memory pressure on the system. If the system starts seeing memory pressure then the CLR may start decreasing the amount of memory it has asked for.

On large systems like Exchange 2007 servers, Operations Manager's default private bytes threshold of 100MB may end up being too small. Since there is limited memory pressure on these systems, the CLR may commit more than 100MB of memory and not de-allocate until there is more memory pressure on the system.

Symptoms

Health Service restarts even after installing the following hotfixes (monitoringhost.exe) on X64 Systems running Exchange 2007.

950853 A memory leak occurs when you monitor Exchange Server 2007 by using the MOM 2007 agent in System Center Operations Manager 2007

http://support.microsoft.com/default.aspx?scid=kb;EN-US;950853

951979 Problems occur on a management server that is running System Center Operations Manager 2007 Service Pack 1 when certain management packs are installed

http://support.microsoft.com/default.aspx?scid=kb;EN-US;951979

You'll see the following event being logged in the Operations Manager Event log.

Event Type: Warning

Event Source: Health Service Script

Event Category: None

Event ID: 6026

Date: 6/26/2008

Time: 8:59:55 PM

User: N/A

Computer: SGBD012512

Description:

LaunchRestartHealthService.js : Launching Restart Health Service. Monitoring Host exceeded Process\Private Bytes threshhold.

To update the agent restart threshold, perform the following steps

1. In the authoring section of the console, find the Monitoring Host Private Bytes Threshold Rule for the Agent class

2. Select to override the value for a group. To override for all Exchange 2007 Servers, select the All Computers in Management Pack: Microsoft.Exchange.2007 Computer Group. It is also possible to override this for single instances of agents or your own groups.

3. Set the Threshold to 629145600 (600MB). Note that you should not store the override in the Default Management Pack

4. Perform the same steps for the Health Service Private Bytes Threshold Monitor (again for the Agent class; this will prevent the agent health state from changing to critical)
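
The override value in step 3 is entered in bytes, so it helps to check the conversion. The Python sketch below shows the arithmetic; the function names are illustrative assumptions, and the comparison simply mirrors the "above the threshold" wording used earlier in this post.

```python
def megabytes_to_bytes(megabytes: int) -> int:
    """Convert MB to bytes using 1 MB = 1024 * 1024 bytes."""
    return megabytes * 1024 * 1024


def exceeds_threshold(private_bytes: int, threshold_mb: int) -> bool:
    """True when MonitoringHost.exe private bytes are above the configured threshold."""
    return private_bytes > megabytes_to_bytes(threshold_mb)
```

Here megabytes_to_bytes(600) gives 629145600, the value used in the override, and a process holding 300 MB of private bytes exceeds the 100 MB default but not the 600 MB override.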

(Thanks to Marc Reyhner and Brian Zoucha for most of the content of this article)

↧

How to Delete an Operations Manager 2007 Managed Object from a Management Group using PowerShell (Updated PS script which works on PowerShell V2 and above)

September 3, 2008, 2:30 pm

≫ Next: Installing Web Console on Windows Server 2008

≪ Previous: You may need to update the agent restart threshold on Exchange 2007 Servers Managed by Operations Manager 2007

(Updated PS script below which works on PowerShell V2 and above)

We have had some customers ask us if there is a way to use PowerShell to remove an agent-managed computer from OpsMgr 2007 after the computer has been turned off. The scenario is that a machine with an agent installed gets decommissioned for some reason, and the user would now like to remove all traces left by that agent machine using PowerShell. The attached PowerShell script should take care of this for you. Note: This script is not officially supported by the product team and should be used only when needed.

You need to specify the FQDN of the machine as an argument when you run the script, for example: .\DeleteAgent.ps1 satyamachine.vel.net

Why am I not able to execute custom scripts in PowerShell?

On some PowerShell installs you will not be able to run this script because PowerShell is probably running in “Restricted” mode, which does not allow scripts to run. You can change this by doing the following.

To change the script execution mode from the default RemoteSigned script execution mode, use the Set-ExecutionPolicy cmdlet (for example, Set-ExecutionPolicy Unrestricted) in the OpsMgr 2007 Command Shell. The OpsMgr 2007 Command Shell recognizes the change to the policy immediately.

Users that want to set a consistent script execution mode for all computers that are running the OpsMgr 2007 Command Shell should apply the script execution mode setting by using an Active Directory group policy. You configure the Active Directory group policy to set the ExecutionPolicy value located under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\PowerShell\1\ShellIds\Microsoft.PowerShell registry key to the desired script execution mode.

Satya Vel | Program Manager | System Center |

↧

Installing Web Console on Windows Server 2008

December 17, 2008, 1:52 pm

≫ Next: Troubleshooting OpsMgr 2007 and OpsMgr 2012 certificate issues with PowerShell

≪ Previous: How to Delete an Operations Manager 2007 Managed Object from a Management Group using PowerShell (Updated PS script which works on PowerShell V2 and above)

Recently, I have seen a bunch of newsgroup questions on installing the Web Console component on Windows Server 2008. Please take a look at the following requirements to get up and running on Server 2008

You must add the Web Server (IIS) role and then install the following server role services:

IIS 6 Metabase Compatibility
IIS 6 WMI Compatibility
Static Content
Default Document
Directory Browsing
HTTP Errors
ASP.NET
.NET Extensibility
ISAPI Extensions
ISAPI Filters
Request Filtering

You must apply the following hotfix http://support.microsoft.com/kb/954049/ to fix the following issue:
The Operations Manager Web Console does not display pages correctly because of changes in User Account Control (UAC) in Windows Server 2008. You may experience one or more of the following symptoms:

The Web Console does not open.
The My Workspace page does not display.
Performance views do not display.

Thanks!!

Rob Kuehfus | System Center Operations Manager | Setup and Deployment Program Manager

↧

Troubleshooting OpsMgr 2007 and OpsMgr 2012 certificate issues with PowerShell

January 23, 2009, 6:35 pm

≫ Next: To Repair or not to Repair, that is the question?

≪ Previous: Installing Web Console on Windows Server 2008

The steps for configuring certificates in Operations Manager are numerous and easy to confuse. I regularly see posts to the newsgroups and discussion lists trying to troubleshoot why certificate authentication is not working, perhaps for a workgroup machine or gateway. Sometimes it takes 3 or 4 messages back and forth before I or anyone else can diagnose what the problem actually is. Only then can we suggest how to fix it.

In an attempt to make this diagnosis stage easier, I put together a PowerShell script that automatically checks installed certificates for the needed properties and configuration. If you think everything is set up correctly but the machines just won't communicate, try running this script on each computer and it will hopefully point you to the issue. I have tried to provide useful knowledge for fixing the problems.

This script is for stand-alone PowerShell 1.0 - it does not require the Ops Mgr PowerShell snap-ins.

Please leave a comment if you find bugs or if the script gives a faulty verdict for you - either it says your setup is fine but actually it's not working OR it says your setup is busted but the machines communicate anyways. I appreciate this feedback and will use it to improve and update the script.

Thanks!

Lincoln

Update 2/6/2009: Added some fixes to the script thanks to feedback from a few sources and a bit of extra testing.

Update 7/1/2009: Fix for computer name checking for workgroup machines

↧

To Repair or not to Repair, that is the question?

April 25, 2009, 2:01 pm

≫ Next: Notification Test Tool

≪ Previous: Troubleshooting OpsMgr 2007 and OpsMgr 2012 certificate issues with PowerShell

I have recently seen many different uses of the agent "Repair…" action when it comes to agent install and upgrade. I have put together the following information with the help of the setup team to guide you on the uses of the repair action.

Summary:

When running agent repair the system basically does a reinstall of the agent using the default action account that was specified during initial install (local system in most cases). Also, if the agent was removed manually from the server, but still present on the console in a "Grayed out" state, using agent repair will cause the agent's health service not to start because it does not know the default action account to use due to being previously uninstalled.

If you were using the AD Integration feature, the agent behavior will be changed back to being managed via the RMS, no longer using AD Integration.

Full details can be found on the OpsManJam Web site.

Link to article - http://www.opsmanjam.com/Lists/OpsManJam%20Announcements/DispForm.aspx?ID=12

Thanks!!

Rob Kuehfus | System Center Operations Manager | Setup and Deployment Program Manager

↧

Notification Test Tool

May 19, 2009, 1:11 pm

≫ Next: What’s new in R2 Deployment and Upgrade

≪ Previous: To Repair or not to Repair, that is the question?

The Notifications Test Tool is a simple tool you can use to test your notification channels and subscriptions. This tool can be used to test email (SMTP), Instant Message (IM) or Text message (SMS) channels. Also, you can use this to generate a test alert in OpsMgr to test subscriptions.

Install Instructions:

Download and install the OMTestTool.msi to a local directory on your computer.

Note: This tool must be installed on a computer that has the OpsMgr console installed, and it must be run by someone logged on as a local administrator. To run the tool you must be a member of the OpsMgr Administrator group.

I have included a small user guide in the zip file. This tool will work on all versions of OpsMgr. I plan to update this tool to support the multiple channel types that R2 now exposes when I get some time.

Note: If you are running this tool on an OS where UAC is enabled please make sure to right click and run as Administrator.

Enjoy!

Rob Kuehfus | System Center Operations Manager | Program Manager

Disclaimer:

This utility is supplied "as-is" with no support. In addition, my thoughts and opinions often change, and as a weblog is intended to provide a semi-permanent point in time snapshot you should not consider out of date posts to reflect my current thoughts and opinions.

↧

What’s new in R2 Deployment and Upgrade

May 28, 2009, 7:24 pm

≫ Next: Windows Agent Install MSI Use Cases and Commands

≪ Previous: Notification Test Tool

<Update> I have included information on how to upgrade from R2 RC to R2 RTM.

Not much has changed when upgrading from Operations Manager 2007 SP1 to R2 or deploying a fresh installation of Operations Manager 2007 R2. However, I do want to take the opportunity to point out a few of the changes.

Changes to supported platforms

Operations Manager R2 now supports OM database roles running on SQL 2008 with SP1. Please be sure to install SP1 of SQL 2008 before beginning your R2 install or upgrade. Our Prerequisite Checker looks for SQL 2008 SP1 and will block until it is found. Note: We still continue to support SQL 2005 with SP1 or higher.

Operations Manager R2 also supports upgrading from SQL 2005 to SQL 2008. Only the Operations Manager Reporting role requires special instructions. Detailed instructions can be found in the upgrade guide. Upgrading SQL Server 2005 to SQL Server 2008 Link

Upgrade Support

Operations Manager supports the following upgrade paths:

Agent Mixed Management Group Multi-homed Support

For agents that are multi-homed we support the following:

Upgrading from R2 Trial version (Eval) to R2 RTM (Select)

Upgrading from R2 RC to R2 RTM <New>

When upgrading from R2 RC to R2 RTM please follow the same procedures outlined in the Upgrade guide for upgrading from OM SP1 to OM R2.

http://technet.microsoft.com/en-us/library/dd362729.aspx

R2 Monitored Item Capacity changes from SP1

New Supported Operating Systems for Agent

Windows Server 2008 Foundation, WePOS v1, POSReady, Windows Embedded Standard, Windows Embedded Enterprise, Windows Embedded POS Ready

New Documentation and Guides

Upgrade Guide

Troubleshooting Guide

Deployment Guide

Lastly, we have added some new functionality for adding a RMS cluster to an existing management group. In one of my next blog posts I plan to go into more detail on this topic.

Thank you!

Rob Kuehfus | System Center Operations Manager | Setup and Deployment Program Manager

↧

Windows Agent Install MSI Use Cases and Commands

July 8, 2009, 3:42 am

≫ Next: How does MP Import work? And how do MPs get down to the HealthService?

≪ Previous: What’s new in R2 Deployment and Upgrade

Recently, I have had a bunch of requests for more detail on installing a Windows Agent or upgrading an Agent from the command line. So here you go!!

Use Case	Command
Fresh Install and set specific action account	msiexec.exe /i \\path\Directory\MOMAgent.msi /qn /l*v %temp%\MOMAgentinstall.log USE_SETTINGS_FROM_AD=0 MANAGEMENT_GROUP=<MG_Name> MANAGEMENT_SERVER_DNS=<MSDNSName> ACTIONS_USE_COMPUTER_ACCOUNT=0 ACTIONSUSER=<AccountUser> ACTIONSDOMAIN=<AccountDomain> ACTIONSPASSWORD=<AccountPassword> USE_MANUALLY_SPECIFIED_SETTINGS=1 SET_ACTIONS_ACCOUNT=1
Fresh Install and use local system as action account	msiexec.exe /i \\path\Directory\MOMAgent.msi /qn /l*v %temp%\MOMAgentinstall.log USE_SETTINGS_FROM_AD=0 MANAGEMENT_GROUP=<MG_Name> MANAGEMENT_SERVER_DNS=<MSDNSName> ACTIONS_USE_COMPUTER_ACCOUNT=1 USE_MANUALLY_SPECIFIED_SETTINGS=1 SET_ACTIONS_ACCOUNT=1
Fresh Install with AD Integration and set specific action account	msiexec /i \\path\Directory\MOMAgent.msi /qn /l*v %temp%\mominst.NoGroupSpecified.log USE_SETTINGS_FROM_AD=1 USE_MANUALLY_SPECIFIED_SETTINGS=0 ACTIONS_USE_COMPUTER_ACCOUNT=0 ACTIONSUSER=<AccountUser> ACTIONSDOMAIN=<AccountDomain> ACTIONSPASSWORD=<AccountPassword> SET_ACTIONS_ACCOUNT=1
Fresh Install with AD Integration and use local system as action account	msiexec /i \\path\Directory\MOMAgent.msi /qn /l*v %temp%\mominst.NoGroupSpecified.log USE_SETTINGS_FROM_AD=1 ACTIONS_USE_COMPUTER_ACCOUNT=1 USE_MANUALLY_SPECIFIED_SETTINGS=0 SET_ACTIONS_ACCOUNT=1
Upgrade from SP1	msiexec /i \\path\Directory\MOMAgent.msi SET_ACTIONS_ACCOUNT=0 /qn /l*v %temp%\MOMAgentUpgrade.log
Upgrade from R2 Beta	msiexec /i \\path\Directory\MOMAgent.msi SET_ACTIONS_ACCOUNT=0 /qn /l*v %temp%\MOMAgentUpgrade.log
Upgrade from R2 RC	msiexec /i \\path\Directory\MOMAgent.msi SET_ACTIONS_ACCOUNT=0 /qn /l*v %temp%\MOMAgentUpgrade.log
Change Management server name	MsiExec.exe /i \\path\Directory\MOMAgent.msi /norestart /qn /l*v %temp%\MOMAgentReinstall.log MANAGEMENT_GROUP=<SameManagementGroup> MANAGEMENT_GROUP_OPERATION=ModifyConfigGroup MANAGEMENT_SERVER_DNS=<NewManagementServerName> REINSTALL=ALL
Add new MG	MsiExec.exe /i \\path\Directory\MOMAgent.msi /norestart /qn /l*v %temp%\MOMAgentAdd.log MANAGEMENT_GROUP=<NewManagementGroup> MANAGEMENT_GROUP_OPERATION=AddConfigGroup MANAGEMENT_SERVER_DNS=<NewManagementServerName> REINSTALL=ALL Note: If fresh install is already AD enabled, then re-homing configuration is set by AD.
Remove old MG	MsiExec.exe /i \\path\Directory\MOMAgent.msi /norestart /qn /l*v %temp%\MOMRemove.log MANAGEMENT_GROUP=<OldManagementGroup> MANAGEMENT_GROUP_OPERATION=RemoveConfigGroup MANAGEMENT_SERVER_DNS=<OldManagementServerName> REINSTALL=ALL Note: If this is the last MG assigned to this agent, the agent will also be uninstalled. Note: If fresh install is already AD enabled, then re-homing configuration is set by AD.
Re-Home (remove old MG\MS, redirect to new MG/MS)	Step 1: Add new management group \ management server: MsiExec.exe /i \\path\Directory\MOMAgent.msi /norestart /qn /l*v %temp%\MOMAgentAdd.log MANAGEMENT_GROUP=<NewManagementGroup> MANAGEMENT_GROUP_OPERATION=AddConfigGroup MANAGEMENT_SERVER_DNS=<NewManagementServerName> REINSTALL=ALL Step 2: Remove old management group \ management server: MsiExec.exe /i \\path\Directory\MOMAgent.msi /norestart /qn /l*v %temp%\MOMRemove.log MANAGEMENT_GROUP=<OldManagementGroup> MANAGEMENT_GROUP_OPERATION=RemoveConfigGroup MANAGEMENT_SERVER_DNS=<OldManagementServerName> REINSTALL=ALL Note: Make sure to add the new management group first before removing the old one. If you remove the old management group first and it’s the last management group assigned to the agent, the agent will be uninstalled. Note: If fresh install is already AD enabled, then re-homing configuration is set by AD.
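The re-home case is the one where ordering matters, so here is a small Python sketch that produces the two command lines in the safe order (add first, remove second). The function names are hypothetical helpers made up for this example; the MSI properties and log file names are the ones from the table, and /l*v is used for logging as in the upgrade rows.

```python
def build_config_group_command(msi_path, log_name, operation, group, server):
    """Build one MsiExec command line that changes the agent's management group list."""
    parts = [
        "MsiExec.exe", "/i", msi_path, "/norestart", "/qn",
        "/l*v", "%temp%\\" + log_name,
        "MANAGEMENT_GROUP=" + group,
        "MANAGEMENT_GROUP_OPERATION=" + operation,
        "MANAGEMENT_SERVER_DNS=" + server,
        "REINSTALL=ALL",
    ]
    return " ".join(parts)


def rehome_commands(msi_path, new_group, new_server, old_group, old_server):
    """Return [add, remove] so the agent is never left without a management group."""
    add = build_config_group_command(
        msi_path, "MOMAgentAdd.log", "AddConfigGroup", new_group, new_server)
    remove = build_config_group_command(
        msi_path, "MOMRemove.log", "RemoveConfigGroup", old_group, old_server)
    return [add, remove]


if __name__ == "__main__":
    for line in rehome_commands(r"\\path\Directory\MOMAgent.msi",
                                "NewMG", "ms2.domain1.net",
                                "OldMG", "ms1.domain1.net"):
        print(line)
```

Returning the commands as an ordered list makes it harder to run the removal first by accident, which would uninstall the agent if the old management group is the only one assigned.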

USE_SETTINGS_FROM_AD=0

Indicates that management group settings properties will be set on the command line.

MANAGEMENT_GROUP=MG1

Sets the management group the computer will be managed by to MG1.

MANAGEMENT_SERVER_DNS=MS1.Domain1.net

Sets the management server fully qualified domain name (FQDN) to MS1.Domain1.net. To use a gateway server, enter the gateway server FQDN as MANAGEMENT_SERVER_DNS.

Important

If the computer's DNS and Active Directory names differ, the MANAGEMENT_SERVER_AD_NAME property also needs to be set to the fully qualified Active Directory Domain Services name.

USE_MANUALLY_SPECIFIED_SETTINGS=1

Indicates that management group settings properties will be set on the command line.

SET_ACTIONS_ACCOUNT=1

Indicates that the default action account will be set on the command line.

SECURE_PORT

Sets the health service port number.

ACTIONS_USE_COMPUTER_ACCOUNT=0

Sets the Agent Action account to a specified user account, instead of to Local System.

ACTIONSUSER=AgentAction

Sets the Agent Action account to AgentAction.

ACTIONSDOMAIN=Domain1

Sets the Agent Action account domain to Domain1.

ACTIONSPASSWORD=Password#2007

Sets the Agent Action account password to Password#2007.

Note: Order of parameters does not matter.
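To show how the properties above fit together for a fresh install without AD Integration, here is a minimal Python sketch that assembles the command line. The function name and its arguments are assumptions for illustration; only the MSI property names and values come from this post. It does not quote values, so it assumes none of them contain spaces.

```python
def build_agent_install(msi_path, management_group, management_server,
                        action_account=None):
    """Build a fresh-install command line.

    action_account is None to use Local System, or a (user, domain, password)
    tuple to set a specific Agent Action account.
    """
    args = [
        "msiexec.exe", "/i", msi_path, "/qn",
        "/l*v", "%temp%\\MOMAgentinstall.log",
        "USE_SETTINGS_FROM_AD=0",
        "USE_MANUALLY_SPECIFIED_SETTINGS=1",
        "MANAGEMENT_GROUP=" + management_group,
        "MANAGEMENT_SERVER_DNS=" + management_server,
        "SET_ACTIONS_ACCOUNT=1",
    ]
    if action_account is None:
        # Local System runs the agent actions.
        args.append("ACTIONS_USE_COMPUTER_ACCOUNT=1")
    else:
        user, domain, password = action_account
        args += [
            "ACTIONS_USE_COMPUTER_ACCOUNT=0",
            "ACTIONSUSER=" + user,
            "ACTIONSDOMAIN=" + domain,
            "ACTIONSPASSWORD=" + password,
        ]
    return " ".join(args)


if __name__ == "__main__":
    print(build_agent_install(r"\\path\Directory\MOMAgent.msi",
                              "MG1", "MS1.Domain1.net"))
```

Keep in mind that a command line built this way contains the action account password in clear text, so treat any script or log that holds it accordingly.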

Rob Kuehfus | System Center Operations Manager | Setup and Deployment Program Manager

This is supplied "as-is" with no support. In addition, my thoughts and opinions often change, and as a weblog is intended to provide a semi-permanent point in time snapshot you should not consider out of date posts to reflect my current thoughts and opinions.

↧

How does MP Import work? And how do MPs get down to the HealthService?

November 11, 2009, 12:32 pm

≫ Next: Trouble with setup due to SQL not being detected?

≪ Previous: Windows Agent Install MSI Use Cases and Commands

Many people have asked how MP import works and how do MPs get down to the health service. Hopefully this diagram helps explain some of this.

MP import

Satya

↧

Trouble with setup due to SQL not being detected?

October 26, 2010, 4:00 pm

≫ Next: Updated momclean Reskit tool for OM2007R2

≪ Previous: How does MP Import work? And how do MPs get down to the HealthService?

If you have run into issues during setup where you know that you have a supported version of SQL running on your machine but setup is unable to detect it, you may have hit a SQL WMI provider issue. In OpsMgr 2007 R2, this shows up as the SQL Server requirement in the Prerequisite Checker failing.

Setup calls WMI to check whether the SQL version meets the prerequisites. Once you have verified that you do in fact have SQL installed on the machine and that it is indeed a supported version of SQL, check whether the WMI provider is the issue by trying to connect to SQL using SQL Server Configuration Manager. If it is, you should receive an error that reads:

"Cannot connect to WMI provider. You do not have permission or the server is unreachable. Note that you can only manage SQL Server 2005 and later servers with SQL Server Configuration Manager. Invalid class."

The solution for this issue is to recompile your MOF file, which can be done using the steps detailed in this blog: http://msmvps.com/blogs/martinpoon/archive/2009/11/27/sql-server-configuration-manager-cannot-connect-to-wmi-provider-invalid-class-0x80041010.aspx

Happy deployment!

↧

Updated momclean Reskit tool for OM2007R2

November 16, 2010, 9:58 am

≫ Next: Schedule Maintenance Mode Reskit Tool Info

≪ Previous: Trouble with setup due to SQL not being detected?

I noticed on the forums someone was having trouble using this with the R2 release. Here is an updated version of the momclean tool to use for uninstalling OM2007R2 when traditional methods have failed. Note: None of the command line parameters have changed. All that has been updated is the product GUIDs the tool supports. Also, I plan to get the official release of this in one of the upcoming CUs for R2.

Thanks to Sandip for updating the tool.

Thanks!

Rob Kuehfus | System Center Operations Manager | Program Manager

R2_cleanmom Download

↧

Schedule Maintenance Mode Reskit Tool Info

June 21, 2011, 1:17 am

≫ Next: SCOM 2007 R2 and SP1 now supports SQL Server 2005 SP4

≪ Previous: Updated momclean Reskit tool for OM2007R2

I have a quick update on the Schedule Maintenance Mode RESKIT Tool, along with some information and best practices.

Best Practices:

Do not go beyond 20 Jobs scheduled. Anything over this will start to place too much load on your Root Management Server.
Do not schedule more than 20 items in one Job. If you need to go over this, please create a group and target that instead. Note: The MP has a bug that limits you to selecting at most six objects. I have attached an updated MP that corrects this issue below.
When scheduling a group, make sure to select system.group and then select the group. If you select the group itself, the tool lists the individual group membership.
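If you keep your planned schedules in a script or spreadsheet, the two limits above are easy to check before you create the jobs. This is only a sketch: the function name and the dictionary layout (job name mapped to the list of targeted objects) are assumptions for this example and are not something the tool itself exposes.

```python
MAX_JOBS = 20            # best-practice ceiling on scheduled jobs
MAX_ITEMS_PER_JOB = 20   # best-practice ceiling on objects per job


def check_schedule(jobs):
    """Return a list of warnings for a {job name: [targets]} plan."""
    warnings = []
    if len(jobs) > MAX_JOBS:
        warnings.append("%d jobs scheduled; keep it to %d or fewer"
                        % (len(jobs), MAX_JOBS))
    for name, items in sorted(jobs.items()):
        if len(items) > MAX_ITEMS_PER_JOB:
            warnings.append("job '%s' targets %d items; target a group instead"
                            % (name, len(items)))
    return warnings


if __name__ == "__main__":
    plan = {"weekend-patching": ["server%02d" % i for i in range(25)]}
    for warning in check_schedule(plan):
        print(warning)
```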

Known Issues:

A common issue for our international customers is a date-time casting exception, or the schedule not landing on the correct day. This tool was designed and tested on EN-US only. I cannot make any promises, but I will look into extending this to support other locales in the future. The workaround is to install the Windows EN-US language pack.

Updated Schedule Maintenance Mode MP Download Zip file

↧

SCOM 2007 R2 and SP1 now supports SQL Server 2005 SP4

August 10, 2011, 11:50 am

≫ Next: Topology changes in System Center 2012 Operations Manager (Overview)

≪ Previous: Schedule Maintenance Mode Reskit Tool Info

OM Community,

System Center Operations Manager 2007 SP1 and System Center Operations Manager 2007 R2 now support SQL Server 2005 SP4. Note: We will have the Supported Configuration and a KB article posted in the next few weeks to make this more official, but feel free to go ahead and install it.

For the most part, nothing special needs to be done when installing SP4 on the Operational, Data Warehouse, and Audit Collection databases. For the Operations Reporting Role, however, you will need to perform the following additional steps to complete the SP4 installation.

1. Open Internet Information Services (IIS) Manager (not 6.0) – found under Administrative Tools from Start menu
Within IIS Manager:

Expand local machine connection to see App Pools and Sites

Select Application Pools

Find the app pool created by the Reporting Server installation, which has the Identity column’s value set to the domain account used for the DW Reader account.

Select that app pool and right click, selecting “Advanced Settings” from the context menu

Under the “Process Model” section, change the value for “Identity” from the domain account to “NetworkService”

Click “OK” to close the Advanced Settings dialog and save the changes

With that app pool still selected, click “Recycle” under the “Application Pool Tasks” section of the Actions area to the right

2. Run SQL2005 SP4 – it should now complete successfully
NOTE: At this point, if the Console were opened, Reporting would fail to load

3. Within IIS Manager, reverse the previous process:

Expand local machine connection to see App Pools and Sites

Select Application Pools

Find the app pool created by the Reporting Server installation, which has the Identity column’s value set to “NetworkService”

Select that app pool and right click, selecting “Advanced Settings” from the context menu

Under the “Process Model” section, change the value for “Identity” from “NetworkService” back to the original domain account

Click “OK” to close the Advanced Settings dialog and save the changes

With that app pool still selected, click “Recycle” under the “Application Pool Tasks” section of the Actions area to the right

4. Open the Console and Reporting should load successfully

5. Verify that Reports work as expected
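The shape of the procedure above is "switch the identity, do the work, always switch it back". The Python sketch below shows that pattern with a context manager. It is purely illustrative: the pool is a plain dictionary standing in for the app pool settings, the names are made up for this example, and nothing here talks to IIS or SQL Server.

```python
from contextlib import contextmanager


@contextmanager
def temporary_identity(pool, temp_identity="NetworkService"):
    """Swap pool['identity'] for the duration of the block, then restore it."""
    original = pool["identity"]
    pool["identity"] = temp_identity      # step 1: switch to NetworkService
    try:
        yield pool                        # step 2: the SP4 install would run here
    finally:
        pool["identity"] = original       # step 3: restore the domain account


if __name__ == "__main__":
    # Hypothetical app pool record; the account name is a placeholder.
    pool = {"name": "ReportServer", "identity": r"DOMAIN1\DWReader"}
    with temporary_identity(pool):
        print("During install:", pool["identity"])
    print("After install:", pool["identity"])
```

The point of the try/finally is that the original identity is put back even if the middle step fails, which is the state you want before opening the Console again.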

Thanks!

Rob Kuehfus | Program Manager | System Center Operations Manager

↧

Topology changes in System Center 2012 Operations Manager (Overview)

August 22, 2011, 2:18 pm

≫ Next: Setup and Upgrade in System Center Operations Manager 2012

≪ Previous: SCOM 2007 R2 and SP1 now supports SQL Server 2005 SP4

OM Community,

In this blog post, I will explain the changes made to the Operations Manager 2012 infrastructure topology. The purpose of this post is not to do a deep technical explanation on how some of these new features work but more of an overview around the new changes and how they may affect you. Over the next few months, we (Operations Manager Team) plan to blog additional technical details.

First things first (let’s review):

In previous versions, Operations Manager (2007, 2007 SP1, R2) had a parent-child topology, meaning that in a Management Group a Management Server called the Root Management Server (commonly known as the RMS) acted as a parent to one or more secondary Management Servers or Gateways. The RMS has many unique responsibilities in the Management Group (see below).

The RMS provides the following services:

Console access
Role based access control
Distribution of configurations to agents
Connectors to other mgmt systems
Alert notifications
Health aggregation
Group Calculations
Availability
Dependency Monitor
DB Grooming
Enables model based mgmt

The RMS also introduces the following customer challenges:

Performance and scalability bottleneck
Single point of failure (for RMS workloads)
High availability requires clustering

With these kinds of challenges, it was very important for most IT & Operations teams to ensure their RMS was highly available and easily recoverable in case of a disaster.

This left them with two options:

Cluster the Root Management Server (see picture)
Promote a secondary Management Server to the RMS

Unfortunately, both of these options created additional complexity and burden for IT & Operations teams. Windows clustering is complex to set up and requires additional shared storage. Patching a clustered RMS was cumbersome and prone to creating instability in the Management Group. Promoting a secondary Management Server was a manual process that required running a specialized command line tool and then changing multiple configuration files and registry keys on other components such as the Reporting Server, Web Console, or other Management Servers. Depending on the customer's SLA to the business, they would implement one or both of these solutions to ensure some level of availability in the Management Group.

Also, having this single point of failure in the Management Group created a bottleneck that limits how many Windows Agents, Unix Agents and consoles a single Management Group can support.

During product planning for OM12, we quickly identified this as one of our highest priorities. By removing the single point of failure we can provide our customers a much better story around High Availability and lower their costs of maintaining an Operations Manager infrastructure. Also, we can scale the Management Group out to support new OM12 features like Network Monitoring and Application Monitoring (APM).

The Major Change (RMS Removal)

After an in-depth investigation we decided to remove the Root Management Server role from the Operations Manager topology. As a result, we needed to figure out how to distribute the workloads the RMS performed. This boils down to three things.

SDK Service - Make sure this service is running on all Management servers and that any kind of SDK client (Console, Web Console, Connector, PowerShell) can connect to it.
Configuration Service - Responsible for detecting and issuing new configuration to all Health Services in the Management Group. Federate the Configuration Service to each Management Server so they all work together to keep the Management Group up to date.
Health Service - Balance the RMS specific workloads amongst all Management servers in the Management Group and make sure during a failure the work is redistributed.

SDK Service

In OM12, setup sets this service to automatically start on every Management Server during install. We support any SDK client connecting to any Management Server. At Beta, you will need to configure NLB on the SDK Service for automatic failover.

Configuration Service

In order to federate the Config Service we needed to rewrite it almost completely. If you remember, in OM 2007 versions the RMS always required a huge amount of memory to function properly. One of the main reasons for this was the Config Service. Every time the Config Service starts, it reads the Operational Database and loads its view of the instance space into memory as XML. In larger Management Groups, this file can easily grow to over 6 GB. The Config Service uses this file to compare against the Operational Database to detect changes and issue new configuration to Health Services. Now that every Management Server will have a running, active Configuration Service, it is no longer reasonable to store this in memory. Moving forward, the Config Service will store this data in a centralized database (Operational db) that all Config Services in the Management Group help keep up to date and use to detect configuration changes to the instance space.

A fantastic benefit that came out of this design is a much faster startup of the Config Service. Once the database is initially created, on subsequent starts the Config Service does not need to rebuild this database from scratch and instead just maintains it. Therefore, it starts issuing configuration much sooner after restart. This is a major improvement over OM 2007 versions, where in a large management group it could take up to an hour to start issuing configuration to agents.
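The "compare a stored view against the current state to detect changes" idea can be pictured with a tiny Python sketch. This is a conceptual illustration only and says nothing about how the Config Service is actually implemented: the function name, the dictionary representation of the instance space, and the sample data are all assumptions made for this example.

```python
def diff_snapshot(stored, current):
    """Compare a stored snapshot with the current state and report what changed.

    Both arguments map an instance id to its configuration value.
    """
    added = sorted(key for key in current if key not in stored)
    removed = sorted(key for key in stored if key not in current)
    changed = sorted(key for key in current
                     if key in stored and stored[key] != current[key])
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    stored = {"agent01": "config-v1", "agent02": "config-v1"}
    current = {"agent01": "config-v2", "agent03": "config-v1"}
    # Only instances that appear in the result would need new configuration.
    print(diff_snapshot(stored, current))
```

Keeping the stored snapshot somewhere durable is what allows a restart to resume from the existing view instead of rebuilding it, which matches the faster startup described above.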

Health Service (Resource Pools)

To distribute the RMS specific workloads to all management servers, we needed to develop a mechanism for each Health Service on the management server to function independently, while still having awareness of the workloads the other management servers are performing. This helps to ensure we do not get workflow duplication or missed workflows. To achieve this we added a new feature to OM12 called Resource Pools. A Resource Pool is a collection of Health Services working together to manage instances assigned to the pool. Workflows targeted to the instances are loaded by the Health Service in the Resource Pool that ends up managing that instance. If one of the Health Services in the Resource Pool were to fail, the other Health Services in the pool will pick up the work that the failed member was running. We also use Resource Pools to bring high availability to other product features like Networking and Unix monitoring. In a follow up blog post I will dive into far more detail on how resource pools work and how to tell where things are running.
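To make the failover behavior concrete, here is a toy Python model of a pool: every instance has exactly one owner, and when a member fails only its instances move to the survivors. The round-robin placement is an assumption chosen for simplicity; this post does not describe the real assignment algorithm, and the function and server names are made up for the example.

```python
def assign(instances, members):
    """Spread instances across pool members round-robin (illustrative policy only)."""
    if not members:
        raise ValueError("pool has no members")
    ordered = sorted(members)
    return {inst: ordered[i % len(ordered)]
            for i, inst in enumerate(sorted(instances))}


def fail_member(assignment, members, failed):
    """Move only the failed member's instances to the surviving members."""
    survivors = sorted(m for m in members if m != failed)
    if not survivors:
        raise ValueError("no surviving pool members")
    orphans = sorted(inst for inst, owner in assignment.items() if owner == failed)
    result = dict(assignment)  # instances on healthy members stay where they are
    for n, inst in enumerate(orphans):
        result[inst] = survivors[n % len(survivors)]
    return result


if __name__ == "__main__":
    members = ["ms1", "ms2", "ms3"]
    before = assign(["i1", "i2", "i3", "i4"], members)
    after = fail_member(before, members, "ms1")
    print(before)
    print(after)
```

Because the assignment is a mapping from instance to a single owner, the model cannot run a workflow twice or drop one, which is the property the pool is meant to guarantee.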

To distribute the RMS specific workloads we create three resource pools by default.

All Management Servers Resource Pool – We have re-targeted most RMS specific instances and workflows to this pool. By default, if an instance does not have a “should manage” relationship set, the Configuration Service will assign it to this pool.

Update April 2, 2012: Notifications are no longer managed by the All Management Servers Pool. There is a dedicated Notifications Resource Pool now which is described below.

Notifications Resource Pool– We have re-targeted the Alert Subscription Service instance to this pool. The reason we did not use the “All Management Servers Resource Pool” was so you can easily remove the management servers from the pool that should not be participating in Notifications. For example, you may have three management servers but only one SMS modem. You would remove the other management servers from the pool so the Notifications workflows do not run where no modem is present.
AD Assignment Resource Pool – Again we have re-targeted the AD Integration workflows to this pool so you can more easily control the location around where the AD assignment workflows will be running.

Notice in the screen capture above we have a column called Membership and it is set to “Automatic” for the default pools. This means all management servers in the management group are automatically a member of these pools. In order to change this you need to open PowerShell and run a PowerShell command (see below).

Get-SCOMResourcePool -Name "AD Assignment Resource Pool" | Set-SCOMResourcePool -EnableAutomaticMembership $FALSE

Now I can right click properties of the “AD Assignment Resource Pool” and modify the management server membership. Note: New management servers added to the Management Group will no longer be members of this resource pool automatically.

At this point you may be wondering about workflows targeted to the RMS that are outside of the OpsMgr product group's control (other management packs from different Microsoft teams or third party vendors). In order not to break backwards compatibility, and to provide support for legacy management packs, we decided to leave the Root Management Server instance and add a special role to one of the management servers in the Management Group called the RMS Emulator. This RMS Emulator is only for backwards compatibility with legacy management packs and is in no way required for the management group to function correctly.

You can easily tell which management server is the RMS Emulator by opening the Console and navigating to the Management Servers view in the Administration space. We have added a new column called “RMS Emulator”. By default the first management server installed in the management group is the RMS Emulator. When upgrading to OM12 the former RMS is the RMS Emulator. Note: When upgrading from a secondary management server using the UpgradeManagementGroup switch the RMS Emulator is the management server you are running this from. On a follow-up blog post we will dive into more detail on setup and upgrade changes.

We have provided PowerShell cmdlets to move the RMS Emulator from one management server to another in case the management server acting as the RMS Emulator were to fail.

• To identify the current RMS Emulator in PowerShell

get-SCOMRMSemulator

• Move to another Management Server

–First assign the new RMS Emulator management server to a variable

$MS = get-scommanagementserver -Name <FQDN of Management Server>

Set-SCOMRMSEmulator $MS

•Delete the RMS Emulator

Remove-SCOMRMSEmulator

–Type “Y” to approve

–Run get-SCOMRMSemulator to validate it is removed. You should see a message that says the RMS Emulator Role not found.

•Add RMS Emulator role to the MG

–First assign the new RMS Emulator management server to a variable

$MS = get-scommanagementserver -Name <FQDN of Management Server>

Set-SCOMRMSEmulator $MS

–Run “get-SCOMRMSEmulator” to verify it has been created

Design Considerations

A few things to keep in mind when planning your OM12 Management groups with the topology changes.

Due to the introduction of Resource Pools, it is recommended that all management servers have no more than 5 ms latency between them. This means that if you are currently using management servers in multiple datacenters or sites, we recommend you move all management servers to a single data center and use Gateway servers at the other sites.
Moving forward, the Product Group recommendation will always be to have at least two management servers in a Management Group at all times. By doing this you will always have High Availability for your management group and a much easier recovery during a disaster.
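Both recommendations are simple enough to check against a planned design. The Python sketch below does that, assuming you have measured latency between each pair of management servers yourself; the function name, the input layout, and the server names are hypothetical and exist only for this example.

```python
MAX_LATENCY_MS = 5.0          # recommended ceiling between management servers
MIN_MANAGEMENT_SERVERS = 2    # recommended minimum per management group


def check_topology(servers, latency_ms):
    """Return a list of issues for a planned management group.

    servers is a list of management server names; latency_ms maps a
    (server_a, server_b) tuple to the measured latency in milliseconds.
    """
    issues = []
    if len(servers) < MIN_MANAGEMENT_SERVERS:
        issues.append("fewer than %d management servers" % MIN_MANAGEMENT_SERVERS)
    for (a, b), latency in sorted(latency_ms.items()):
        if latency > MAX_LATENCY_MS:
            issues.append("%s <-> %s latency %.1f ms exceeds %.1f ms"
                          % (a, b, latency, MAX_LATENCY_MS))
    return issues


if __name__ == "__main__":
    # Two servers in different sites: consider a Gateway at the remote site instead.
    print(check_topology(["ms1", "ms2"], {("ms1", "ms2"): 12.0}))
```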

I hope this post has provided you with a lot of information to get you started on designing an Operations Manager 2012 topology. The next post in our series will be about the Setup and Upgrade changes.

Thanks

Rob Kuehfus | Program Manager | System Center

Disclaimer

This posting is provided "AS IS" with no warranties, and confers no rights. Use of included utilities are subject to the terms specified at http://www.microsoft.com/info/copyright.htm.

↧