How to compare multiple files?

The context

For a project, we had to create over 100 virtual machines (VM), each one with specific caracteristics: OS, # CPU, RAM, # disks, disk space... About 1/3 VM are Windows, 2/3 Linux.
I maintain a MS Access Database with the requested VM.
The project is near ending; for quality control, I need to validate what is delivered vs. what was requested. My sysadmins went on almost(*) all the VM to get the caracteristics; the result is 2 text files, one for Windows and one for Linux.
*: This is important. I don't have all the VM yet, hence I will have to do this process again; this have to be taken into account in my solution.

How to compare the data?

Method 1: by hand

Boring, tedious and highly error prone. Forgotten.

Method 2: using Excel

I tried that first. But I'm not a king on Excel, and because the caracteristics are quite different between Linux and Windows, is't hard to make comparisons. Abandoned.

Method 3: using Access

This is actually the most logical way, because my initial source of information is already in Access. Moreover, I do SQL for many years, so I feel comfortable to compare the servers with it (despite Access's SQL is somehow different from standard).
However, because the caracteristics are different between each server, a import from the text file did not yield good results.
But wait! Access is able to manage XML... let's try that.

What I've done?

  1. I've merged the 2 text files
  2. I've transform the raw text file into XML with a basic text editor; tedious, and I made a lot of errors, but with the help of online XML syntax checker, I've ended with a clean file.
  3. Import the XML to Access: encountered a first problem
Problem #1: attributes are badly managed
My XML file was similar to this:
  <server kernel="..." name="..." os="..." ..>
       <fs mount="/dev" name="devtmpfs" size="5.8G" ... >
However, using attributes was a bad idea: Access doesn't seem able to manage them and I ended with empty fields in the result table:
To overcome this, I turned the attributes into elements (easy, using the Find & Replace function of Notepad):
 <server> <name>...</name> <os>...</os> <ci>...</ci> ...
     <fs> <name>devtmpfs</name> <size>12G</size> <mount>/dev</mount> ... </fs>
Problem #2: no common key between the generated tables
Access ended with 2 tables: server and filesystems. However, no primary key is defined in server, but worse no foreign key is defined in filesystems!
The solution: transforming the XML during the import. Access allows to specify a XSLT file to transform the XML.
XML source --> XSLT transformation --> Access tables
Here is my XSLT:
<xml version="1.0" encoding="UTF-8"?>
  <xsl:stylesheet version="1.0" xmlns:xsl="">
    <xsl:output indent="yes" omit-xml-declaration="yes" />
    <xsl:strip-space elements="*" />
    <xsl:template match="node()">
        <xsl:apply-templates select="node()">
    <xsl:template match="fs">
        <ci><xsl:apply-templates select="../ci"></ci>
  </xsl:stylesheet >

Update: gathering data with Ansible

I now have access to an Ansible console; this will allow me to gather the data on a regular basis. Sadly, Ansible output format is JSON, and not XML. However, JSON can easily be transformed to XML with online tools such as


Popular Posts