How to compare multiple files?

The context

For a project, we had to create over 100 virtual machines (VMs), each with specific characteristics: OS, number of CPUs, RAM, number of disks, disk space... About 1/3 of the VMs run Windows, 2/3 Linux.
I maintain an MS Access database of the requested VMs.
The project is nearing its end; for quality control, I need to validate what was delivered against what was requested. My sysadmins logged on to almost(*) all the VMs to collect their characteristics; the result is 2 text files, one for Windows and one for Linux.
*: This is important. I don't have all the VMs yet, so I will have to repeat this process; my solution has to take this into account.

How to compare the data?

Method 1: by hand

Boring, tedious and highly error-prone. Ruled out.

Method 2: using Excel

I tried that first. But I'm no Excel wizard, and because the characteristics differ quite a bit between Linux and Windows, it's hard to make comparisons. Abandoned.

Method 3: using Access

This is actually the most logical way, because my initial source of information is already in Access. Moreover, I have been writing SQL for many years, so I feel comfortable comparing the servers with it (even though Access's SQL differs somewhat from the standard).
However, because the characteristics differ from one server to the next, importing the text file did not yield good results.
But wait! Access can handle XML... let's try that.

What did I do?

  1. I merged the 2 text files.
  2. I transformed the raw text file into XML with a basic text editor; it was tedious and I made a lot of errors, but with the help of an online XML syntax checker, I ended up with a clean file.
  3. I imported the XML into Access, and ran into a first problem.
Problem #1: attributes are badly managed
My XML file was similar to this:
<servers>
  <server kernel="..." name="..." os="..." ...>
    <filesystems>
      <fs mount="/dev" name="devtmpfs" size="5.8G" ... />
      ...
    </filesystems>
  </server>
  ...
</servers>
However, using attributes was a bad idea: Access doesn't seem to be able to handle them, and I ended up with empty fields in the resulting table.
To overcome this, I turned the attributes into elements (easy, using the Find & Replace function of Notepad):
<servers>
  <server> <name>...</name> <os>...</os> <ci>...</ci> ...
    <filesystems>
      <fs> <name>devtmpfs</name> <size>12G</size> <mount>/dev</mount> ... </fs>
      ...
    </filesystems>
  </server>
  ...
</servers>
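
The attribute-to-element conversion can also be scripted, which helps since the process has to be repeated when the remaining VMs are delivered. The following is only a minimal sketch using Python's standard `xml.etree.ElementTree`, not what I actually did (I used Find & Replace). The function name `attributes_to_elements` and the sample values such as `srv01` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(root):
    """Turn every attribute into a child element, modifying the tree in place."""
    # Snapshot the nodes first so the elements we add are not visited.
    for node in list(root.iter()):
        # Insert the new elements ahead of existing children, sorted by name.
        for position, (key, value) in enumerate(sorted(node.attrib.items())):
            child = ET.Element(key)
            child.text = value
            node.insert(position, child)
        node.attrib.clear()
    return root


# Hypothetical sample in the attribute-based layout.
SAMPLE = """
<servers>
  <server name="srv01" os="Linux">
    <filesystems>
      <fs mount="/dev" name="devtmpfs" size="5.8G" />
    </filesystems>
  </server>
</servers>
"""

root = attributes_to_elements(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding="unicode"))
```

The output is not pretty-printed, but it is well-formed and contains no attributes, which is what the Access import needs.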
Problem #2: no common key between the generated tables
Access ended up with 2 tables: server and filesystems. However, no primary key is defined in server and, worse, no foreign key is defined in filesystems!
The solution: transform the XML during the import. Access lets you specify an XSLT file to transform the XML.
XML source --> XSLT transformation --> Access tables
Here is my XSLT:
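<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" omit-xml-declaration="yes" />
  <xsl:strip-space elements="*" />
  <!-- Identity template: copy every node unchanged -->
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates select="node()" />
    </xsl:copy>
  </xsl:template>
  <!-- For each fs, add the ci of the enclosing server (fs -> filesystems -> server) as a key -->
  <xsl:template match="fs">
    <fs>
      <ci><xsl:value-of select="../../ci" /></ci>
      <!-- Keep the original fields of the fs (name, size, mount, ...) -->
      <xsl:apply-templates select="node()" />
    </fs>
  </xsl:template>
</xsl:stylesheet>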
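
To check what the transformation is meant to produce, the same idea can be sketched outside Access. The snippet below is an illustrative Python equivalent, not the XSLT itself: it copies each server's `ci` into every `fs` below it, so that each filesystem row carries a key pointing back to its server. The function name `add_server_key` and the sample values (`srv01`, `CI-0001`) are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_server_key(root, key="ci"):
    """Copy each server's key element into every fs element below it."""
    for server in root.iter("server"):
        key_value = server.findtext(key)
        # Snapshot the fs list because we modify each fs while looping.
        for fs in list(server.iter("fs")):
            foreign_key = ET.Element(key)
            foreign_key.text = key_value
            fs.insert(0, foreign_key)
    return root


# Hypothetical sample in the element-based layout.
SAMPLE_ELEMENTS = """
<servers>
  <server><name>srv01</name><ci>CI-0001</ci>
    <filesystems>
      <fs><name>devtmpfs</name><mount>/dev</mount></fs>
      <fs><name>tmpfs</name><mount>/run</mount></fs>
    </filesystems>
  </server>
</servers>
"""

keyed = add_server_key(ET.fromstring(SAMPLE_ELEMENTS))
for fs in keyed.iter("fs"):
    # Each filesystem now carries the ci of its server.
    print(fs.findtext("ci"), fs.findtext("name"), fs.findtext("mount"))
```

With the key present in both places, the two imported tables can be joined on `ci` in an Access query.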

Update: gathering data with Ansible

I now have access to an Ansible console; this will allow me to gather the data on a regular basis. Sadly, Ansible's output format is JSON, not XML. However, JSON can easily be transformed into XML with online tools such as https://www.freeformatter.com/json-to-xml-converter.html
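
For a recurring job, the conversion could also be done locally instead of through a website. Here is a minimal sketch with Python's standard `json` and `xml.etree.ElementTree` modules. The JSON below is a made-up example, not real Ansible output, and `json_to_xml` and its `item_tags` parameter are names invented for this sketch. It assumes every JSON key is a valid XML element name.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(tag, value, item_tags=None):
    """Recursively convert a parsed JSON value into an XML element."""
    item_tags = item_tags or {}
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, item in value.items():
            element.append(json_to_xml(key, item, item_tags))
    elif isinstance(value, list):
        # JSON list items have no name; look one up, else fall back to "item".
        child_tag = item_tags.get(tag, "item")
        for item in value:
            element.append(json_to_xml(child_tag, item, item_tags))
    elif value is not None:
        element.text = str(value)
    return element


# Hypothetical input shaped like the server/filesystems layout used above.
RAW = '{"name": "srv01", "os": "Linux", "filesystems": [{"name": "devtmpfs", "mount": "/dev"}]}'

server = json_to_xml("server", json.loads(RAW), {"filesystems": "fs"})
print(ET.tostring(server, encoding="unicode"))
```

Mapping the `filesystems` list to `fs` elements keeps the result in the same element-based layout that the Access import already handles.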
