Analysing SCCM Logs with Log Parser Studio

Posted on June 21, 2014

Microsoft System Centre Configuration Manager (SCCM) is used in an Active Directory environment to amongst other things deliver patches and antivirus to the servers and workstations.

In a large environment you can have quite a number of SCCM servers operating in tandem to deliver the data to the client devices. Obtaining details in an SCCM environment of exactly which endpoints are being served by which servers (or how busy each server is) isn’t quite as straightforward as one might imagine.  The client devices all connect to the servers through HTTP calls to collect the packages they require. The IIS logs from the SCCM servers can be downloaded and analysed to try to figure out what is happening in the environment.

The IIS logs contain a number of useful fields (a sample included below) :

LogFilename : E:\logs\SCCMP101\SCCMP101\W3SVC1\u_ex140410.log
LogRow : 5
date : 41739
time : 36526.7730902778
c-ip : 10.75.xx.xx
cs-username :
s-sitename :
s-computername :
s-ip : 10.98.xx.xx
s-port : 80
cs-method : GET
cs-uri-stem : /SMS_MP/.sms_aut
cs-uri-query : MPLIST
sc-status : 200
sc-substatus : 0
sc-win32-status : 0
sc-bytes : 608
cs-bytes : 124
time-taken : 15
cs-version :
cs-host : sccmp101.xx.net
cs(User-Agent) : SMS+CCM

Further fields such as Cookie, Referer events and process types are present but I found these to be generally blank.  The above example includes the data transferred (sc-bytes and cs-bytes) which were not turned on by default and which I found quite useful. These can be activated in IIS easily enough.

In my use case I obtained the logs from 83 servers which amounted to 2859 files over 254 folders coming to 122GB (uncompressed). The proves to be a little bit of a challenge when I don’t have SQL server installed on my PC or Laptop, MS Access is limited to 2GB database and even SQL Express 2014 is limited to 10GB.

I had previously heard of (but not used) Microsoft Log Parser. A quick search revealed version 2.2 (released some 9 years ago in 2005 – but don’t let that put you off) available for download. Now this is a pretty nifty tool as it understands log files in many different formats (and even plain text files). This saves you from having to clean up the log files to strip out headers (which are written to the logs every time the web server starts / restarts) and from having to combine many different files. A real time-saver.

You can then write SQL-like queries and have them executed against your log files to get your results. Now with data of the size above on my 8 Gig i7 870 @ 2.93GHZ running off a Seagate 2TB 7200rpm SATA drive it takes around 3.5 hours to run a query (know how to speed this up fundamentally do share).  Using Task Manager to monitor the execution of the query shows CPU utilisation of only around 8% (one thread sits at about a 50% utilisation) memory utilisation of between 50MB upwards (as the result set grows) and disk speed varying from about 8MB/s to 30MB/s. So not quite sure where the bottleneck lies.

Writing the queries at the command line is a little bit of a pain so the next bit of genius is the Log Parser Studio.  Unlike Log Parser, the studio is kept up to date, with the latest build being 2.0.0.100 from 23 May 2014. The studio provides a library of queries (over 170)  covering logs from ActiveSync, IIS, Windows event logs, Exchange, Outlook Web access amongst others. Covers a huge number of ideas for useful analysis and provides the queries to do it for you.  What is great is that you can use these, modify them for your own purposes or create your own from scratch, and add them all to your own personal library.

For example, to understand what the patterns of access look like over a month a query such as this can be pretty useful.

/* Server total time and data transferred by date */
SELECT s-ip,date,count (s-ip),sum(time-taken),sum(sc-bytes),sum(cs-bytes)
FROM '[LOGFILEPATH]'
GROUP BY s-ip,date

A challenge that you quickly come across is that both the Log Parser and the Studio (which is dependent on the Parser) are 32 bit applications so you need to be careful as to which fields and how much summarisation is included in the result set. If the results grow to quickly then the query will crash as it runs out of memory. The trick is to find a balance between too much information (it crashes) and too little information (need to run many queries). Finding the right balance means further analysis can be done in Excel or Access. I have found that 30 000+ rows still works if few enough fields are chosen and some smart queries are used.

When executing the above query a problem is that the bytes transferred exceeds the value that can be stored in the data type, so you end up with negative numbers for some of the lines returned. Rewriting the query as follows assists in resolving this problem :

/* Server total time and data transferred by date */
SELECT s-ip,date,count (s-ip),sum(div(time-taken,60)),sum(div(sc-bytes,1024)),sum(div(cs-bytes,1024))
FROM '[LOGFILEPATH]'
GROUP BY s-ip,date

Log parser supports a huge number of different functions but you need to know the format to use them. Take a look here for a listing and examples : http://logparserplus.com/Functions

So whereas I would have expected sum(time-taken/60) to give me a result in minutes it fails with an unknown field. Even (sum((time-taken)/60) fails. Check the function reference shows that log parser wants it as sum(div(time-taken,60)) and then life is happy again. Running your query again having just spent 3 hours waiting for the last one to complete – a little less so.

Using these tools and queries I was then able to summarise down the 160 gig of source data into a few thousand rows that could be imported into Access and joined to CMDB data to produce really useful intelligence that can be analysed directly in Access, Excel or Gephi. Thanks Microsoft for these great free products.

I was also looking for further pre-built queries for Log Parser Studio but was unable to find any. If you know where such may be found, please do share. If there is interest and as I use the tool more I will be happy to share anything I have created.

 

 

 

 

 

 

 

 

Visualising Security Data : SCCM patching traffic flows

Posted on March 03, 2014

I have been experimenting a little recently with visualisation of security data.

We have had some challenges with SCCM and needing to understand which clients were connecting to which servers, where and why. This data seemed very hard to come by and after some discussions with some helpful Microsoft South Africa folk the service provider pulled the IIS logs from most of our SCCM Primary Servers and the Distribution Points.

I then added in a Destination column (being the server from which the log was pulled) and combined the logs from all of the servers (6 Primary and 6 Distribution). In MS Access I then summarised the data by source and destination pairs, providing 13952 connections. This was exported as a CSV and headings added in using Notepad (Gephi wouldn’t read the data file without headings named to its liking).

The data was then loaded into Gephi as edge data. I then searched for each of the 12 servers in the node table, added in a Label, changed the colour and size (Red 30 for Primary Blue 20 for secondary), selected the Force Atlas option and let it plot my data. The 13 546 nodes and 113952 were then plotted providing the graph below (when exported as PDF).

The graph was somewhat unexpected in that I did not foresee so many of the workstations being served from Primary servers nor so many devices receiving data from multiple servers. A few of the DP’s (top and bottom of screen) clearly are not serving the numbers of workstations we would expect and need deeper investigation.

While Excel cross tabs and more detailed access queries provide more detailed insight into what is going on this visualisation very quickly demonstrates a very different picture to which the service provider running the SCCM infrastructure had been describing.

Have you done anything similar? Please do share.

 

Map of SCCM links using Gephi

Map of SCCM links using Gephi

 

Download the PDF version here :  map of sccm v2

 

%d bloggers like this: